Posted to dev@kafka.apache.org by Rajini Sivaram <ra...@gmail.com> on 2017/02/17 17:05:33 UTC

[DISCUSS] KIP-124: Request rate quotas

Hi all,

I have just created KIP-124 to introduce request rate quotas to Kafka:

https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas

The proposal is for a simple percentage request handling time quota that
can be allocated to *<client-id>*, *<user>* or *<user, client-id>*. There
are also a few other suggestions under "Rejected alternatives". Feedback
and suggestions are welcome.
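
For readers unfamiliar with how such a quota could be enforced on the broker:
the existing byte-rate quotas are built on the public
org.apache.kafka.common.metrics classes, and a request-time quota could
plausibly reuse the same machinery. A minimal sketch, assuming handler time is
recorded in seconds so that the Rate becomes a utilization fraction (the
entity and metric names are illustrative, not from the KIP):

import java.util.concurrent.TimeUnit;
import org.apache.kafka.common.metrics.MetricConfig;
import org.apache.kafka.common.metrics.Metrics;
import org.apache.kafka.common.metrics.Quota;
import org.apache.kafka.common.metrics.QuotaViolationException;
import org.apache.kafka.common.metrics.Sensor;
import org.apache.kafka.common.metrics.stats.Rate;

public class RequestTimeQuotaSketch {
    public static void main(String[] args) {
        Metrics metrics = new Metrics();

        // One sensor per quota entity (a <user, client-id> pair here),
        // bounded at 0.05, i.e. 5% of request handling time.
        MetricConfig config = new MetricConfig()
                .timeWindow(1, TimeUnit.SECONDS) // quota.window.size.seconds
                .samples(11)                     // quota.window.num default
                .quota(Quota.upperBound(0.05));
        Sensor sensor = metrics.sensor("request-time:user1:clientA", config);
        sensor.add(metrics.metricName("request-time", "request-quotas"),
                new Rate(TimeUnit.SECONDS));

        long handlerTimeNanos = 2_000_000L; // measured per request
        try {
            sensor.record(handlerTimeNanos / 1_000_000_000.0);
        } catch (QuotaViolationException e) {
            // Quota violated: delay the response (as the byte-rate quotas
            // do today) rather than fail the request.
        }
        metrics.close();
    }
}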

Thank you...

Regards,

Rajini

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by radai <ra...@gmail.com>.
I don't think time/CPU% is easy to reason about. Most user-facing quota
systems I know (especially the commercial ones) focus on things users
understand better - IOPS and bytes.

As for quotas and "overhead" requests like heartbeats - on the one hand,
subjecting them to the quota may cause clients to be treated as dead. On the
other, not subjecting them to the quota opens the broker up to DoS attacks.
How about giving overhead requests their own quota, separate from "real"
(user-initiated?) requests? Slightly more complicated, but I think it solves
the issue.

How long are throttled requests held in purgatory? Wouldn't this, at some
point, still tie up broker resources? Wouldn't it be better (for high enough
delay values) to just return an error to the client (quota exceeded, try
again in 3 seconds)?

How would these quotas work across an entire cluster? If they are enforced
independently on every single broker, you'd be hitting "narrow" clients (who
interact with fewer partitions, and hence fewer brokers) much harder than
clients who operate across a lot of partitions.
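
To make the last point concrete, an illustration with made-up numbers (nothing
here is from the KIP): suppose a notional 1000 req/s allowance is enforced
independently on each broker of a 10-broker cluster.

public class PerBrokerQuotaSkew {
    public static void main(String[] args) {
        int brokers = 10;
        double perBrokerQuota = 1000.0; // req/s, enforced independently

        // A client whose partitions all sit on one broker tops out at the
        // per-broker bound, while a client spread across all brokers can
        // consume ten times as much cluster-wide before any single broker
        // throttles it.
        double narrowClientMax = perBrokerQuota;         //  1,000 req/s
        double wideClientMax = brokers * perBrokerQuota; // 10,000 req/s
        System.out.println(narrowClientMax + " vs " + wideClientMax);
    }
}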

On Thu, Feb 23, 2017 at 8:02 AM, Ismael Juma <is...@juma.me.uk> wrote:

> Thanks for the KIP, Rajini. This is a welcome improvement and the KIP page
> covers it well. A few comments:
>
> 1. Can you expand a bit on the motivation for throttling requests that fail
> authorization for ClusterAction? Under what scenarios would this help?
>
> 2. I think we should rename `throttle_time_ms` in the new version of
> produce/fetch response to make it clear that it refers to the byte rate
> throttling. Also, it would be good to include the updated schema for the
> responses (we typically try to do that whenever we update protocol APIs).
>
> 3. I think I am OK with using absolute units, but I am not sure about the
> argument why it's better than a percentage. We are comparing request
> threads to CPUs, but they're not the same as increasing the number of
> request threads doesn't necessarily mean that the server can cope with more
> requests. In the example where we double the number of threads, all the
> existing users would still have the same capacity proportionally speaking
> so it seems intuitive to me. One thing that would be helpful, I think, is
> to describe a few scenarios where the setting needs to be adjusted and how
> users would go about doing it.
>
> 4. I think it's worth mentioning that TLS increases the load on the network
> thread significantly and for cases where there is mixed plaintext and TLS
> traffic, the existing byte rate throttling may not do a great job. I think
> it's OK to tackle this in a separate KIP, but worth mentioning the
> limitation.
>
> 5. We mention DoS attacks in the document. It may be worth mentioning that
> this mostly helps with clients that are not malicious. A malicious client
> could generate a large number of connections to counteract the delays that
> this KIP introduces. Kafka has connection limits per IP today, but not per
> user, so a distributed DoS could bypass those. This is not easy to solve at
> the Kafka level since the authentication step required to get the user may
> be costly enough that the brokers will eventually be overwhelmed.
>
> 6. It's unfortunate that the existing byte rate quota configs use
> underscores instead of dots (like every other config) as separators. It's
> reasonable for `io_thread_units` to use the same convention as the byte
> rate configs, but it's not great that we are adding to the inconsistency. I
> don't have any great solutions apart from perhaps accepting the dot
> notation for all these configs as well.
>
> Ismael
>
> On Fri, Feb 17, 2017 at 5:05 PM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Hi all,
> >
> > I have just created KIP-124 to introduce request rate quotas to Kafka:
> >
> > https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+
> > Request+rate+quotas
> >
> > The proposal is for a simple percentage request handling time quota that
> > can be allocated to *<client-id>*, *<user>* or *<user, client-id>*. There
> > are a few other suggestions also under "Rejected alternatives". Feedback
> > and suggestions are welcome.
> >
> > Thank you...
> >
> > Regards,
> >
> > Rajini
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Ismael Juma <is...@juma.me.uk>.
Thanks for the KIP, Rajini. This is a welcome improvement and the KIP page
covers it well. A few comments:

1. Can you expand a bit on the motivation for throttling requests that fail
authorization for ClusterAction? Under what scenarios would this help?

2. I think we should rename `throttle_time_ms` in the new version of
produce/fetch response to make it clear that it refers to the byte rate
throttling. Also, it would be good to include the updated schema for the
responses (we typically try to do that whenever we update protocol APIs).

3. I think I am OK with using absolute units, but I am not sure about the
argument for why they're better than a percentage. We are comparing request
threads to CPUs, but they're not the same, as increasing the number of
request threads doesn't necessarily mean that the server can cope with more
requests. In the example where we double the number of threads, all the
existing users would still have the same capacity, proportionally speaking,
so a percentage seems intuitive to me. One thing that would be helpful, I
think, is to describe a few scenarios where the setting needs to be adjusted
and how users would go about doing it.

4. I think it's worth mentioning that TLS increases the load on the network
thread significantly and for cases where there is mixed plaintext and TLS
traffic, the existing byte rate throttling may not do a great job. I think
it's OK to tackle this in a separate KIP, but worth mentioning the
limitation.

5. We mention DoS attacks in the document. It may be worth mentioning that
this mostly helps with clients that are not malicious. A malicious client
could generate a large number of connections to counteract the delays that
this KIP introduces. Kafka has connection limits per IP today, but not per
user, so a distributed DoS could bypass those. This is not easy to solve at
the Kafka level since the authentication step required to get the user may
be costly enough that the brokers will eventually be overwhelmed.

6. It's unfortunate that the existing byte rate quota configs use
underscores as separators instead of the dots that every other config uses. It's
reasonable for `io_thread_units` to use the same convention as the byte
rate configs, but it's not great that we are adding to the inconsistency. I
don't have any great solutions apart from perhaps accepting the dot
notation for all these configs as well.

Ismael

On Fri, Feb 17, 2017 at 5:05 PM, Rajini Sivaram <ra...@gmail.com>
wrote:


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
I have updated the KIP to use request rates instead of request processing
time.

I have removed all requests that require ClusterAction permission
(LeaderAndIsr and UpdateMetadata, in addition to stop/shutdown), but I have
left the Metadata request in. Quota windows, which limit the maximum delay,
tend to be small (1 second by default) compared to the request timeout or
max.block.ms, and even the existing byte rate quotas can impact the time
taken to fetch metadata if the metadata request is queued behind a produce
request (for instance). So I don't think clients will need any additional
exception handling code for request rate quotas beyond what they already
need for byte rate quotas. Clients can flood the broker with metadata
requests (e.g. a producer with retry.backoff.ms=0 sending a message to a
non-existent topic), so it makes sense to throttle metadata requests.
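
A sketch of that pathological client using the standard Java producer API (the
broker address and topic name are placeholders, and auto topic creation is
assumed to be disabled on the broker):

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerConfig;
import org.apache.kafka.clients.producer.ProducerRecord;

public class MetadataFloodSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put(ProducerConfig.BOOTSTRAP_SERVERS_CONFIG, "localhost:9092");
        props.put(ProducerConfig.KEY_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        props.put(ProducerConfig.VALUE_SERIALIZER_CLASS_CONFIG,
                "org.apache.kafka.common.serialization.StringSerializer");
        // With zero backoff, every failed metadata lookup is retried
        // immediately, producing a tight loop of metadata requests.
        props.put(ProducerConfig.RETRY_BACKOFF_MS_CONFIG, "0");

        try (KafkaProducer<String, String> producer = new KafkaProducer<>(props)) {
            // The send blocks (up to max.block.ms) waiting for metadata for
            // a topic that does not exist, re-requesting it continuously.
            producer.send(new ProducerRecord<>("no-such-topic", "key", "value"));
        }
    }
}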


Thanks,

Rajini

On Mon, Feb 20, 2017 at 11:55 AM, Dong Lin <li...@gmail.com> wrote:


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Dong Lin <li...@gmail.com>.
Hey Rajini,

Thanks for the explanation. I have some follow-up questions regarding the
types of requests that will be covered by this quota. Since this KIP focuses
only on throttling the traffic between client and broker, and a client never
sends LeaderAndIsrRequest to the broker, should we exclude LeaderAndIsrRequest
from this KIP?

Besides, I am still not sure we should throttle MetadataRequest. The
benefit of throttling MetadataRequest seems small, since it doesn't increase
with user traffic: a client only sends MetadataRequest when there is a
partition leadership change or when the client's metadata has expired.
On the other hand, if we throttle MetadataRequest, there is a chance that
metadata doesn't get updated in time and the user may receive an exception.
This seems like a big interface change, because users will have to change
application code to handle such exceptions. Note that the current rate-based
quota reduces traffic without throwing any exception to the user.

Anyway, I am looking forward to the updated KIP :)

Thanks,
Dong

On Mon, Feb 20, 2017 at 2:43 AM, Rajini Sivaram <ra...@gmail.com>
wrote:


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Dong, Onur & Becket,

Thank you all for the very useful feedback.

The choice of request handling time as opposed to request rate was based on
the observation in KAFKA-4195
<https://issues.apache.org/jira/browse/KAFKA-4195> that request rates may
be less intuitive to configure than percentage utilization. But since the
KIP is measuring time rather than request pool utilization as suggested in
the JIRA, I agree that request rate would probably work better than
percentage. So I am inclined to change the KIP to throttle on request rates
(e.g. 100 requests per second) rather than percentage. Average request rates
are exposed as metrics, so admins can configure quotas based on those. And
the values are more meaningful from the client application's point of view. I
am still interested in feedback regarding the second rejected alternative,
which throttles based on percentage utilization of the request handler pool.
That was the suggestion from Jun/Ismael in KAFKA-4195, but I couldn't see
how that would help in the case where a small number of connections push
a continuous stream of short requests. Suggestions welcome.

Responses to other questions above:

- (Dong): The KIP proposes to throttle most requests (and not just
Produce/Fetch) since the goal is to control usage of broker resources. So
LeaderAndIsrRequest and MetadataRequest will also be throttled. The few
requests not being throttled are timing-sensitive.

- (Dong): The KIP does not propose to throttle inter-broker traffic based
on request rates. The most frequent requests in inter-broker traffic are
fetch requests, and a well-configured broker would use reasonably good
values of min.bytes and max.wait that avoid overloading the broker
unnecessarily with fetch requests. The existing byte-rate based quotas
should be sufficient in this case.

- (Onur): Quota window configuration - this is the existing configuration
quota.window.size.seconds (also used for byte-rate quotas; see the sketch
after this list for how the window feeds into the throttle delay)

- (Becket): The main issue that the KIP is addressing is clients flooding
the broker with small requests (e.g. fetch with max.wait.ms=0), which can
overload the broker and delay requests from other clients/users even though
the byte rate is quite small. The CPU quota reflects the resource usage on
the broker that the KIP is attempting to limit. Since this is the time on
the local broker, it shouldn't vary much depending on acks=-1 etc., but I do
agree on the unpredictability of time-based quotas. Switching from request
processing time to request rates will address this. Would you still be
concerned that "*Users do not have direct control over the request rate,
i.e. users do **not know when a request will be sent by the clients*"?
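
The thread does not spell out how a throttle delay would be sized once a
quota is violated. As background, a sketch of the formula the existing
byte-rate quota uses to compute its throttle time, applied here to a
request-rate quota; that KIP-124 would reuse this formula is an assumption,
not the KIP's text:

// Size the delay so that, averaged over the full quota window
// (quota.window.num samples of quota.window.size.seconds each), the
// observed rate falls back to the quota bound.
static long throttleTimeMs(double observedRate, double quotaBound,
                           int samples, long windowSizeMs) {
    if (observedRate <= quotaBound)
        return 0L;
    double overage = observedRate - quotaBound;
    return Math.round(overage / quotaBound * samples * windowSizeMs);
}

// Example with the defaults (11 samples of 1 second): a client observed at
// 150 req/s against a 100 req/s quota is delayed
// (50 / 100) * 11 * 1000 = 5500 ms.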

Jun/Ismael,

I am interested in your views on request rate based quotas and whether we
should still consider utilization of the request handler pool.


Many thanks,

Rajini


On Sun, Feb 19, 2017 at 11:54 PM, Becket Qin <be...@gmail.com> wrote:


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Becket Qin <be...@gmail.com>.
Thanks for the KIP, Rajini,

If I understand correctly, the proposal is essentially trying to quota CPU
usage (that is probably why a time slice is used instead of a request rate),
while the existing quota we have is for network bandwidth.

Given we are trying to throttle both CPU and network, that implies the
following patterns for the clients:
1. High CPU usage, high network usage.
2. High CPU usage, low network usage.
3. Low CPU usage, high network usage.
4. Low CPU usage, low network usage.

Theoretically the existing quota addresses cases 3 & 4, and this KIP seems
to be trying to address cases 1 & 2. However, it might be helpful to
understand what we want to achieve with CPU and network quotas.

People mainly use quotas for two different purposes:
a) protecting the broker from misbehaving clients, and
b) resource distribution for multi-tenancy.

I agree that, generally speaking, CPU time is a suitable metric to quota on
for CPU usage and would work for a). However, as Dong and Onur noticed, it
is not easy to quantify the impact of a throttled CPU time for the end users
at the application level. If the purpose of the CPU quota is only
protection, maybe we don't need a user-facing CPU quota.

That said, a user-facing CPU quota could be useful for virtualization,
which may be related to multi-tenancy but is a little different. Imagine
there are 10 services sharing the same physical Kafka cluster. With a CPU
time quota and a network bandwidth quota, each service can provision a
logical Kafka cluster with some reserved CPU time and network bandwidth.
And in this case the quota will be per logical cluster. I am not sure if
this is what the KIP intends for the future, though. It would be good if the
KIP could be clearer about which exact scenarios the CPU quota is trying to
address.

As for the request rate quota, while it seems easy to enforce and intuitive,
there are some caveats.
1. Users do not have direct control over the request rate, i.e. users do
not know when a request will be sent by the clients.
2. Each request may require a different amount of CPU resources to handle.
That may depend on many things, e.g. whether acks = 1 or acks = -1,
whether a request is addressing 1000 partitions or 1 partition, whether a
fetch request requires message format down-conversion or not, etc.
So the result of using a request rate quota could be quite unpredictable.

Thanks,

Jiangjie (Becket) Qin

On Sat, Feb 18, 2017 at 9:35 PM, Dong Lin <li...@gmail.com> wrote:


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Dong Lin <li...@gmail.com>.
I realized the main concern with this proposal is how users can interpret
this CPU-percentage based quota. Since this quota is exposed to users, we
need to explain to them how it is going to impact their application
performance, and convince them that the quota is not too low for their
application. We are able to do this with the byte-rate based quota, but I am
not sure how we can do it with a CPU-percentage based quota. For example,
how is a user going to understand whether 1% CPU is OK?

On Fri, Feb 17, 2017 at 10:11 AM, Onur Karaman <onurkaraman.apache@gmail.com> wrote:

>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Onur Karaman <on...@gmail.com>.
Overall a big fan of the KIP.

I'd have to agree with Dong. I'm not sure about the decision of using the
percentage over the window as opposed to a request rate. It's pretty hard to
reason about. I just spoke to one of our SREs, and he agrees.

Also I may have missed it, but I couldn't find information in the KIP on
where this window would be configured.

On Fri, Feb 17, 2017 at 9:45 AM, Dong Lin <li...@gmail.com> wrote:

>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Dong Lin <li...@gmail.com>.
To correct the typo above: It seems to me that determination of request
rate is not any more difficult than determination of *byte* rate, as both
metrics are commonly used to measure performance and provide guarantees to
users.

On Fri, Feb 17, 2017 at 9:40 AM, Dong Lin <li...@gmail.com> wrote:


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Dong Lin <li...@gmail.com>.
Hey Rajini,

Thanks for the KIP. I have some questions:

- I am wondering why throttling based on request rate is listed as a
rejected alternative. Can you provide a more specific reason why it is
difficult for administrators to decide which request rates to allocate? It
seems to me that determination of request rate is not any more difficult
than determination of request rate, as both metrics are commonly used to
measure performance and provide guarantees to users. On the other hand, the
percentage of processing time provides a vague guarantee to the user. For
example, what performance can a user expect if you provide a 1% processing
time quota to this user? How is the administrator going to decide this
quota? Should the Kafka administrator continue to reduce this percentage
quota as the number of users grows?

- The KIP suggests that LeaderAndIsrRequest and MetadataRequest will also
be throttled by this quota. What is the motivation for throttling these
requests? It is also inconsistent with rate-based quota which is only
applied to ProduceRequest and FetchRequest. IMO it will be simpler to only
throttle ProduceRequest and FetchRequest.

- Do you think we should also throttle the inter-broker traffic using this
quota as well, similar to KIP-73?

Thanks,
Dong




Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Thanks, Jay.

*(1)* The rename from *request.time*.percent to *io.thread*.units for the
quota configuration was based on the change from percent to thread-units,
since we would need different quota configurations for I/O threads and
network threads if we use units. If we agree that *(2)* percent (or ratio)
is a better configuration, then the name can be request.time.percent, with
the same config applying to both request thread utilization and network
thread utilization. Metrics and sensors on the broker side will probably
need to be separate for I/O and network threads so that these can be
accounted separately (5% request.time.percent would mean a maximum of 5% of
request thread utilization and a maximum of 5% of network thread utilization,
with either violation leading to throttling).
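
As an illustration of these semantics, here is a minimal sketch (the delay
policy and all names below are assumptions for illustration, not the KIP's
implementation):

    // Minimal sketch: one configured percentage, enforced independently per
    // thread pool. The delay policy itself is an illustrative assumption.
    def throttleTimeMs(quotaPercent: Double,
                       ioThreadUtilPercent: Double,
                       networkThreadUtilPercent: Double,
                       windowMs: Long): Long = {
      // One simple policy: delay long enough for the measured utilization
      // to fall back to the quota over the quota window.
      def delayFor(utilPercent: Double): Long =
        if (utilPercent <= quotaPercent) 0L
        else ((utilPercent - quotaPercent) / quotaPercent * windowMs).toLong
      // A violation on either pool leads to throttling; apply the larger delay.
      math.max(delayFor(ioThreadUtilPercent), delayFor(networkThreadUtilPercent))
    }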

*(3)* Agree - KIP reflects combined throttling time in a single field in
the response.


On Fri, Feb 24, 2017 at 4:10 PM, Jay Kreps <ja...@confluent.io> wrote:

> A couple of quick points:
>
> 1. Even though the implementation of this quota is only using io thread
> time, i think we should call it something like "request-time". This will
> give us flexibility to improve the implementation to cover network threads
> in the future and will avoid exposing internal details like our thread
> pools on the server.
>
> 2. Jun/Roger, I get what you are trying to fix but the idea of thread units
> is super unintuitive as a user-facing knob. I had to read the KIP like
> eight times to understand this. I'm not sure about your point that
> increasing the number of threads is a problem with a percentage-based
> value; it really depends on whether the user thinks about the "percentage
> of request processing time" or "thread units". If they think "I have
> allocated 10% of my request processing time to user x" then it is a bug
> that increasing the thread count decreases that percent as it does in the
> current proposal. As a practical matter I think the only way to actually
> reason about this is as a percent---I just don't believe people are going
> to think, "ah, 4.3 thread units, that is the right amount!". Instead I
> think they have to understand this thread unit concept, figure out what
> they have set in number of threads, compute a percent and then come up with
> the number of thread units, and these will all be wrong if that thread
> count changes. I also think this ties us to throttling the I/O thread pool,
> which may not be where we want to end up.
>
> 3. For what it's worth I do think having a single throttle_ms field in all
> the responses that combines all throttling from all quotas is probably the
> simplest. There could be a use case for having separate fields for each,
> but I think that is actually harder to use/monitor in the common case so
> unless someone has a use case I think just one should be fine.
>
> -Jay
>
> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > I have updated the KIP based on the discussions so far.
> >
> >
> > Regards,
> >
> > Rajini
> >
> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> rajinisivaram@gmail.com>
> > wrote:
> >
> > > Thank you all for the feedback.
> > >
> > > Ismael #1. It makes sense not to throttle inter-broker requests like
> > > LeaderAndIsr etc. The simplest way to ensure that clients cannot use
> > these
> > > requests to bypass quotas for DoS attacks is to ensure that ACLs
> prevent
> > > clients from using these requests and unauthorized requests are
> included
> > > towards quotas.
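
As a sketch of that rule (the four request names are the ones listed later in
this thread; the helper itself is an illustrative assumption):

    // Inter-broker requests bypass quotas only when ClusterAction is
    // authorized; otherwise their handling time is charged to the caller.
    val interBrokerRequests =
      Set("StopReplica", "ControlledShutdown", "LeaderAndIsr", "UpdateMetadata")

    def exemptFromQuota(requestType: String, clusterActionAuthorized: Boolean): Boolean =
      interBrokerRequests.contains(requestType) && clusterActionAuthorized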
> > >
> > > Ismael #2, Jay #1 : I was thinking that these quotas can return a
> > separate
> > > throttle time, and all utilization based quotas could use the same
> field
> > > (we won't add another one for network thread utilization for instance).
> > But
> > > perhaps it makes sense to keep byte rate quotas separate in
> produce/fetch
> > > responses to provide separate metrics? Agree with Ismael that the name
> of
> > > the existing field should be changed if we have two. Happy to switch
> to a
> > > single combined throttle time if that is sufficient.
> > >
> > > Ismael #4, #5, #6: Will update KIP. Will use dot separated name for new
> > > property. Replication quotas use dot separated, so it will be
> consistent
> > > with all properties except byte rate quotas.
> > >
> > > Radai: #1 Request processing time rather than request rate was chosen
> > > because the time per request can vary significantly between requests, as
> > > mentioned in the discussion and the KIP.
> > > #2 Two separate quotas for heartbeats/regular requests feel like more
> > > configuration and more metrics. Since most users would set quotas
> higher
> > > than the expected usage and quotas are more of a safety net, a single
> > quota
> > > should work in most cases.
> > >  #3 The number of requests in purgatory is limited by the number of
> > active
> > > connections since only one request per connection will be throttled at
> a
> > > time.
> > > #4 As with byte rate quotas, to use the full allocated quotas,
> > > clients/users would need to use partitions that are distributed across
> > the
> > > cluster. The alternative of using cluster-wide quotas instead of
> > per-broker
> > > quotas would be far too complex to implement.
> > >
> > > Dong : We currently have two ClientQuotaManagers for quota types Fetch
> > and
> > > Produce. A new one will be added for IOThread, which manages quotas for
> > I/O
> > > thread utilization. This will not update the Fetch or Produce
> queue-size,
> > > but will have a separate metric for the queue-size.  I wasn't planning
> to
> > > add any additional metrics apart from the equivalent ones for existing
> > > quotas as part of this KIP. Ratio of byte-rate to I/O thread
> utilization
> > > could be slightly misleading since it depends on the sequence of
> > requests.
> > > But we can look into more metrics after the KIP is implemented if
> > required.
> > >
> > > I think we need to limit the maximum delay since all requests are
> > > throttled. If a client has a quota of 0.001 units and a single request
> > used
> > > 50ms, we don't want to delay all requests from the client by 50
> seconds,
> > > throwing the client out of all its consumer groups. The issue is only
> if
> > a
> > > user is allocated a quota that is insufficient to process one large
> > > request. The expectation is that the units allocated per user will be
> > much
> > > higher than the time taken to process one request and the limit should
> > > seldom be applied. Agree this needs proper documentation.
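
The arithmetic behind that 50-second example, as a sketch (the cap policy is
from the discussion above; the function itself is illustrative):

    // With quotaUnits = 0.001, a single request that used 50 ms of handler
    // time would imply 50 ms / 0.001 = 50,000 ms of delay, so the delay is
    // capped at one quota window instead.
    def cappedDelayMs(requestTimeMs: Double, quotaUnits: Double, windowMs: Long): Long = {
      val uncappedMs = (requestTimeMs / quotaUnits).toLong
      math.min(uncappedMs, windowMs)
    }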
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com>
> > wrote:
> > >
> > >> @jun: i wasn't concerned about tying up a request processing thread,
> but
> > >> IIUC the code does still read the entire request out, which might
> add up
> > >> to
> > >> a non-negligible amount of memory.
> > >>
> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com>
> wrote:
> > >>
> > >> > Hey Rajini,
> > >> >
> > >> > The current KIP says that the maximum delay will be reduced to
> window
> > >> size
> > >> > if it is larger than the window size. I have a concern with this:
> > >> >
> > >> > 1) This essentially means that the user is allowed to exceed their
> > quota
> > >> > over a long period of time. Can you provide an upper bound on this
> > >> > deviation?
> > >> >
> > >> > > 2) What is the motivation for capping the maximum delay at the window
> > size?
> > >> I
> > >> > am wondering if there is better alternative to address the problem.
> > >> >
> > >> > 3) It means that the existing metric-related config will have a more
> > >> > direct impact on the mechanism of this io-thread-unit-based quota.
> > This
> > >> > may be an important change depending on the answer to 1) above. We
> > >> probably
> > >> > need to document this more explicitly.
> > >> >
> > >> > Dong
> > >> >
> > >> >
> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com>
> > wrote:
> > >> >
> > >> > > Hey Jun,
> > >> > >
> > >> > > Yeah you are right. I thought it wasn't because at LinkedIn it
> would
> > be
> > >> > too
> > >> > > much pressure on inGraph to expose those per-clientId metrics, so
> we
> > >> ended
> > >> > > up printing them periodically to a local log. Never mind if it is
> not
> > a
> > >> > > general problem.
> > >> > >
> > >> > > Hey Rajini,
> > >> > >
> > >> > > - I agree with Jay that we probably don't want to add a new field
> > for
> > >> > > every quota in ProduceResponse or FetchResponse. Is there any
> use-case
> > >> for
> > >> > > having separate throttle-time fields for byte-rate-quota and
> > >> > > io-thread-unit-quota? You probably need to document this as an
> > interface
> > >> > > change if you plan to add a new field in any request.
> > >> > >
> > >> > > - I don't think IOThread belongs to quotaType. The existing quota
> > >> types
> > >> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> identify
> > >> the
> > >> > > type of request that is throttled, not the quota mechanism that
> is
> > >> > applied.
> > >> > >
> > >> > > - If a request is throttled due to this io-thread-unit-based
> quota,
> > is
> > >> > the
> > >> > > existing queue-size metric in ClientQuotaManager incremented?
> > >> > >
> > >> > > - In the interest of providing a guideline for admins to decide the
> > >> > > io-thread-unit-based quota and for users to understand its impact
> on
> > >> their
> > >> > > traffic, would it be useful to have a metric that shows the
> overall
> > >> > > byte-rate per io-thread-unit? Can we also show this as a per-clientId
> > >> > metric?
> > >> > >
> > >> > > Thanks,
> > >> > > Dong
> > >> > >
> > >> > >
> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io>
> wrote:
> > >> > >
> > >> > >> Hi, Ismael,
> > >> > >>
> > >> > >> For #3, typically, an admin won't configure more io threads than
> > CPU
> > >> > >> cores,
> > >> > >> but it's possible for an admin to start with fewer io threads
> than
> > >> cores
> > >> > >> and grow that later on.
> > >> > >>
> > >> > >> Hi, Dong,
> > >> > >>
> > >> > >> I think the throttleTime sensor on the broker tells the admin
> > >> whether a
> > >> > >> user/clientId is throttled or not.
> > >> > >>
> > >> > >> Hi, Radai,
> > >> > >>
> > >> > >> The reasoning for delaying the throttled requests on the broker
> > >> instead
> > >> > of
> > >> > >> returning an error immediately is that the latter has no way to
> > >> prevent
> > >> > >> the
> > >> > >> client from retrying immediately, which will make things worse.
> The
> > >> > >> delaying logic is based on a delay queue. A separate expiration
> > >> thread
> > >> > >> just waits on the next request to expire. So, it doesn't tie
> > up a
> > >> > >> request handler thread.
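
A self-contained Scala sketch of that delay-queue pattern (the response type
and callback are illustrative, not broker code):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // A throttled response becomes available exactly when its delay expires.
    class ThrottledResponse(delayMs: Long, val send: () => Unit) extends Delayed {
      private val dueNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs)
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(dueNanos - System.nanoTime(), TimeUnit.NANOSECONDS)
      override def compareTo(o: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS), o.getDelay(TimeUnit.NANOSECONDS))
    }

    val throttled = new DelayQueue[ThrottledResponse]()

    // One expiration thread blocks on take(), which returns only the next
    // expired element, so no request handler thread is tied up while waiting.
    val expirer = new Thread(() => while (true) throttled.take().send(), "throttle-expirer")
    expirer.setDaemon(true)
    expirer.start()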
> > >> > >>
> > >> > >> Thanks,
> > >> > >>
> > >> > >> Jun
> > >> > >>
> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk>
> > >> wrote:
> > >> > >>
> > >> > >> > Hi Jay,
> > >> > >> >
> > >> > >> > Regarding 1, I definitely like the simplicity of keeping a
> single
> > >> > >> throttle
> > >> > >> > time field in the response. The downside is that the client
> > metrics
> > >> > >> will be
> > >> > >> > more coarse grained.
> > >> > >> >
> > >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage`
> > and
> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > >> > >> >
> > >> > >> > Ismael
> > >> > >> >
> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io>
> > >> wrote:
> > >> > >> >
> > >> > >> > > A few minor comments:
> > >> > >> > >
> > >> > >> > >    1. Isn't it the case that the throttling time response
> field
> > >> > should
> > >> > >> > have
> > >> > >> > >    the total time your request was throttled irrespective of
> > the
> > >> > >> quotas
> > >> > >> > > that
> > >> > >> > >    caused it? Limiting it to the byte rate quota doesn't make
> > >> sense,
> > >> > >> but I
> > >> > >> > > also
> > >> > >> > >    don't think we want to end up adding new fields in the
> > >> response
> > >> > >> for
> > >> > >> > > every
> > >> > >> > >    single thing we quota, right?
> > >> > >> > >    2. I don't think we should make this quota specifically
> > about
> > >> io
> > >> > >> > >    threads. Once we introduce these quotas people set them
> and
> > >> > expect
> > >> > >> > them
> > >> > >> > > to
> > >> > >> > >    be enforced (and if they aren't it may cause an outage).
> As
> > a
> > >> > >> result
> > >> > >> > > they
> > >> > >> > >    are a bit more sensitive than normal configs, I think. The
> > >> > current
> > >> > >> > > thread
> > >> > >> > >    pools seem like something of an implementation detail and
> > not
> > >> the
> > >> > >> > level
> > >> > >> > > the
> > >> > >> > >    user-facing quotas should be involved with. I think it
> might
> > >> be
> > >> > >> better
> > >> > >> > > to
> > >> > >> > >    make this a general request-time throttle with no mention
> in
> > >> the
> > >> > >> > naming
> > >> > >> > >    about I/O threads and simply acknowledge the current
> > >> limitation
> > >> > >> (which
> > >> > >> > > we
> > >> > >> > >    may someday fix) in the docs that this covers only the
> time
> > >> after
> > >> > >> the
> > >> > >> > >    request is read off the network.
> > >> > >> > >    3. As such I think the right interface to the user would
> be
> > >> > >> something
> > >> > >> > >    like percent_request_time and be in {0,...100} or
> > >> > >> request_time_ratio
> > >> > >> > > and be
> > >> > >> > >    in {0.0,...,1.0} (I think "ratio" is the terminology we
> used
> > >> if
> > >> > the
> > >> > >> > > scale
> > >> > >> > >    is between 0 and 1 in the other metrics, right?)
> > >> > >> > >
> > >> > >> > > -Jay
> > >> > >> > >
> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
> > >> > >> rajinisivaram@gmail.com
> > >> > >> > >
> > >> > >> > > wrote:
> > >> > >> > >
> > >> > >> > > > Guozhang/Dong,
> > >> > >> > > >
> > >> > >> > > > Thank you for the feedback.
> > >> > >> > > >
> > >> > >> > > > Guozhang : I have updated the section on co-existence of
> byte
> > >> rate
> > >> > >> and
> > >> > >> > > > request time quotas.
> > >> > >> > > >
> > >> > >> > > > Dong: I hadn't added much detail to the metrics and sensors
> > >> since
> > >> > >> they
> > >> > >> > > are
> > >> > >> > > > going to be very similar to the existing metrics and
> sensors.
> > >> To
> > >> > >> avoid
> > >> > >> > > > confusion, I have now added more detail. All metrics are in
> > the
> > >> > >> group
> > >> > >> > > > "quotaType" and all sensors have names starting with
> > >> "quotaType"
> > >> > >> (where
> > >> > >> > > > quotaType is Produce/Fetch/LeaderReplication/
> > >> > >> > > > FollowerReplication/*IOThread*).
> > >> > >> > > > So there will be no reuse of existing metrics/sensors. The
> > new
> > >> > ones
> > >> > >> for
> > >> > >> > > > request processing time based throttling will be completely
> > >> > >> independent
> > >> > >> > > of
> > >> > >> > > > existing metrics/sensors, but will be consistent in format.
> > >> > >> > > >
> > >> > >> > > > The existing throttle_time_ms field in produce/fetch
> > responses
> > >> > will
> > >> > >> not
> > >> > >> > > be
> > >> > >> > > > impacted by this KIP. That will continue to return
> byte-rate
> > >> based
> > >> > >> > > > throttling times. In addition, a new field
> > >> > request_throttle_time_ms
> > >> > >> > will
> > >> > >> > > be
> > >> > >> > > > added to return request quota based throttling times. These
> > >> will
> > >> > be
> > >> > >> > > exposed
> > >> > >> > > > as new metrics on the client-side.
> > >> > >> > > >
> > >> > >> > > > Since all metrics and sensors are different for each type
> of
> > >> > quota,
> > >> > >> I
> > >> > >> > > > believe there is already sufficient metrics to monitor
> > >> throttling
> > >> > on
> > >> > >> > both
> > >> > >> > > > client and broker side for each type of throttling.
> > >> > >> > > >
> > >> > >> > > > Regards,
> > >> > >> > > >
> > >> > >> > > > Rajini
> > >> > >> > > >
> > >> > >> > > >
> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <
> > lindong28@gmail.com
> > >> >
> > >> > >> wrote:
> > >> > >> > > >
> > >> > >> > > > > Hey Rajini,
> > >> > >> > > > >
> > >> > >> > > > > I think it makes a lot of sense to use io_thread_units as
> > >> the metric
> > >> > >> to
> > >> > >> > > quota
> > >> > >> > > > > users' traffic here. LGTM overall. I have some questions
> > >> > regarding
> > >> > >> > > > sensors.
> > >> > >> > > > >
> > >> > >> > > > > - Can you be more specific in the KIP about what sensors will
> be
> > >> > added?
> > >> > >> For
> > >> > >> > > > > example, it will be useful to specify the name and
> > >> attributes of
> > >> > >> > these
> > >> > >> > > > new
> > >> > >> > > > > sensors.
> > >> > >> > > > >
> > >> > >> > > > > - We currently have throttle-time and queue-size for
> > >> byte-rate
> > >> > >> based
> > >> > >> > > > quota.
> > >> > >> > > > > Are you going to have separate throttle-time and
> queue-size
> > >> for
> > >> > >> > > requests
> > >> > >> > > > > throttled by io_thread_unit-based quota, or will they
> share
> > >> the
> > >> > >> same
> > >> > >> > > > > sensor?
> > >> > >> > > > >
> > >> > >> > > > > - Does the throttle-time in the ProduceResponse and
> > >> > FetchResponse
> > >> > >> > > > contains
> > >> > >> > > > > time due to io_thread_unit-based quota?
> > >> > >> > > > >
> > >> > >> > > > > - Currently the Kafka server doesn't provide any log or
> > >> metrics
> > >> > >> > that
> > >> > >> > > > tell
> > >> > >> > > > > whether any given clientId (or user) is throttled. This
> is
> > >> not
> > >> > too
> > >> > >> > bad
> > >> > >> > > > > because we can still check the client-side byte-rate
> metric
> > >> to
> > >> > >> > validate
> > >> > >> > > > > whether a given client is throttled. But with this
> > >> > io_thread_unit,
> > >> > >> > > there
> > >> > >> > > > > will be no way to validate whether a given client is slow
> > >> > because
> > >> > >> it
> > >> > >> > > has
> > >> > >> > > > > exceeded its io_thread_unit limit. It is necessary for
> users
> > >> to
> > >> > be
> > >> > >> > able
> > >> > >> > > to
> > >> > >> > > > > know this information to figure out whether they have
> > reached
> > >> > >> their
> > >> > >> > > quota
> > >> > >> > > > > limit. How about we add a log4j log on the server side to
> > >> > >> periodically
> > >> > >> > > > print
> > >> > >> > > > > the (client_id, byte-rate-throttle-time,
> > >> > >> > io-thread-unit-throttle-time)
> > >> > >> > > so
> > >> > >> > > > > that the Kafka administrator can identify those users that have
> > >> > reached
> > >> > >> > their
> > >> > >> > > > > limit and act accordingly?
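
A sketch of such a periodic log (names are illustrative and println stands in
for the log4j logger):

    import java.util.concurrent.{Executors, TimeUnit}

    case class ThrottleTimes(byteRateMs: Long, ioThreadUnitMs: Long)

    // Illustrative source of per-client throttle stats; in the broker these
    // would come from the quota sensors.
    def snapshotThrottleTimes(): Map[String, ThrottleTimes] = Map.empty

    val scheduler = Executors.newSingleThreadScheduledExecutor()
    scheduler.scheduleAtFixedRate(() => {
      snapshotThrottleTimes().foreach { case (clientId, t) =>
        if (t.byteRateMs > 0 || t.ioThreadUnitMs > 0)
          println(s"throttled: client_id=$clientId " +
                  s"byte-rate-throttle-time=${t.byteRateMs} " +
                  s"io-thread-unit-throttle-time=${t.ioThreadUnitMs}")
      }
    }, 1, 1, TimeUnit.MINUTES)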
> > >> > >> > > > >
> > >> > >> > > > > Thanks,
> > >> > >> > > > > Dong
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
> > >> > >> wangguoz@gmail.com>
> > >> > >> > > > wrote:
> > >> > >> > > > >
> > >> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
> > >> comment
> > >> > on
> > >> > >> > the
> > >> > >> > > > > > throttling implementation:
> > >> > >> > > > > >
> > >> > >> > > > > > Stated as "Request processing time throttling will be
> > >> applied
> > >> > on
> > >> > >> > top
> > >> > >> > > if
> > >> > >> > > > > > necessary." I thought that it meant the request
> > processing
> > >> > time
> > >> > >> > > > > throttling
> > >> > >> > > > > > is applied first, but continuing to read I found it
> > actually
> > >> > >> meant to
> > >> > >> > > > apply
> > >> > >> > > > > > produce / fetch byte rate throttling first.
> > >> > >> > > > > >
> > >> > >> > > > > > Also the last sentence "The remaining delay if any is
> > >> applied
> > >> > to
> > >> > >> > the
> > >> > >> > > > > > response." is a bit confusing to me. Maybe rewording
> it a
> > >> bit?
> > >> > >> > > > > >
> > >> > >> > > > > >
> > >> > >> > > > > > Guozhang
> > >> > >> > > > > >
> > >> > >> > > > > >
> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <
> > jun@confluent.io
> > >> >
> > >> > >> wrote:
> > >> > >> > > > > >
> > >> > >> > > > > > > Hi, Rajini,
> > >> > >> > > > > > >
> > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks
> > >> good
> > >> > to
> > >> > >> me.
> > >> > >> > > > > > >
> > >> > >> > > > > > > Jun
> > >> > >> > > > > > >
> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> > >> > >> > > > > rajinisivaram@gmail.com
> > >> > >> > > > > > >
> > >> > >> > > > > > > wrote:
> > >> > >> > > > > > >
> > >> > >> > > > > > > > Jun/Roger,
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > Thank you for the feedback.
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > 1. I have updated the KIP to use absolute units
> > >> instead of
> > >> > >> > > > > percentage.
> > >> > >> > > > > > > The
> > >> > >> > > > > > > > property is called *io_thread_units* to align with
> > the
> > >> > >> thread
> > >> > >> > > count
> > >> > >> > > > > > > > property *num.io.threads*. When we implement
> network
> > >> > thread
> > >> > >> > > > > utilization
> > >> > >> > > > > > > > quotas, we can add another property
> > >> > *network_thread_units.*
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > 2. ControlledShutdown is already listed under the
> > >> exempt
> > >> > >> > > requests.
> > >> > >> > > > > Jun,
> > >> > >> > > > > > > did
> > >> > >> > > > > > > > you mean a different request that needs to be
> added?
> > >> The
> > >> > >> four
> > >> > >> > > > > requests
> > >> > >> > > > > > > > currently exempt in the KIP are StopReplica,
> > >> > >> > ControlledShutdown,
> > >> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These are
> controlled
> > >> > using
> > >> > >> > > > > > ClusterAction
> > >> > >> > > > > > > > ACL, so it is easy to exclude and only throttle if
> > >> > >> > unauthorized.
> > >> > >> > > I
> > >> > >> > > > > > wasn't
> > >> > >> > > > > > > > sure if there are other requests used only for
> > >> > inter-broker
> > >> > >> > that
> > >> > >> > > > > needed
> > >> > >> > > > > > > to
> > >> > >> > > > > > > > be excluded.
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > 3. I was thinking the smallest change would be to
> > >> replace
> > >> > >> all
> > >> > >> > > > > > references
> > >> > >> > > > > > > to
> > >> > >> > > > > > > > *requestChannel.sendResponse()* with a local
> method
> > >> > >> > > > > > > > *sendResponseMaybeThrottle()* that does the
> > throttling,
> > >> if
> > >> > >> any,
> > >> > >> > > and
> > >> > >> > > > > then sends the
> > >> > >> > > > > > > > response. If we throttle first in
> > *KafkaApis.handle()*,
> > >> > the
> > >> > >> > time
> > >> > >> > > > > spent
> > >> > >> > > > > > > > within the method handling the request will not be
> > >> > recorded
> > >> > >> or
> > >> > >> > > used
> > >> > >> > > > > in
> > >> > >> > > > > > > > throttling. We can look into this again when the PR
> > is
> > >> > ready
> > >> > >> > for
> > >> > >> > > > > > review.
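
A sketch of that refactor (all types and helpers below are illustrative
stand-ins, not the actual KafkaApis/RequestChannel API):

    // Every handler sends through one helper, so the time spent inside the
    // handler itself is recorded and any delay is applied in a single place.
    trait Response
    def recordAndGetThrottleMs(clientId: String, requestTimeNanos: Long): Long = 0L // stub
    def sendNow(response: Response): Unit = ()                                      // stub
    def sendAfterDelay(response: Response, delayMs: Long): Unit = ()                // stub

    def sendResponseMaybeThrottle(clientId: String,
                                  requestTimeNanos: Long,
                                  response: Response): Unit = {
      val throttleMs = recordAndGetThrottleMs(clientId, requestTimeNanos)
      if (throttleMs > 0) sendAfterDelay(response, throttleMs) else sendNow(response)
    }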
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > Regards,
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > Rajini
> > >> > >> > > > > > > >
> > >> > >> > > > > > > >
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > >> > >> > > > > roger.hoover@gmail.com>
> > >> > >> > > > > > > > wrote:
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > > Great to see this KIP and the excellent
> discussion.
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
> > >> application
> > >> > is
> > >> > >> > > > > allocated
> > >> > >> > > > > > 1
> > >> > >> > > > > > > > > request handler unit, then it's as if I have a
> > Kafka
> > >> > >> broker
> > >> > >> > > with
> > >> > >> > > > a
> > >> > >> > > > > > > single
> > >> > >> > > > > > > > > request handler thread dedicated to me.  That's
> the
> > >> > most I
> > >> > >> > can
> > >> > >> > > > use,
> > >> > >> > > > > > at
> > >> > >> > > > > > > > > least.  That allocation doesn't change even if an
> > >> admin
> > >> > >> later
> > >> > >> > > > > > increases
> > >> > >> > > > > > > > the
> > >> > >> > > > > > > > > size of the request thread pool on the broker.
> > It's
> > >> > >> similar
> > >> > >> > to
> > >> > >> > > > the
> > >> > >> > > > > > CPU
> > >> > >> > > > > > > > > abstraction that VMs and containers get from
> > >> hypervisors
> > >> > >> or
> > >> > >> > OS
> > >> > >> > > > > > > > schedulers.
> > >> > >> > > > > > > > > While different client access patterns can use
> > wildly
> > >> > >> > different
> > >> > >> > > > > > amounts
> > >> > >> > > > > > > > of
> > >> > >> > > > > > > > > request thread resources per request, a given
> > >> > application
> > >> > >> > will
> > >> > >> > > > > > > generally
> > >> > >> > > > > > > > > have a stable access pattern and can figure out
> > >> > >> empirically
> > >> > >> > how
> > >> > >> > > > > many
> > >> > >> > > > > > > > > "request thread units" it needs to meet it's
> > >> > >> > throughput/latency
> > >> > >> > > > > > goals.
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > Cheers,
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > Roger
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
> > >> > >> jun@confluent.io>
> > >> > >> > > > wrote:
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > > Hi, Rajini,
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > Thanks for the updated KIP. A few more
> comments.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > 1. A concern of request_time_percent is that
> it's
> > >> not
> > >> > an
> > >> > >> > > > absolute
> > >> > >> > > > > > > > value.
> > >> > >> > > > > > > > > > Let's say you give a user a 10% limit. If the
> > admin
> > >> > >> doubles
> > >> > >> > > the
> > >> > >> > > > > > > number
> > >> > >> > > > > > > > of
> > >> > >> > > > > > > > > > request handler threads, that user now actually
> > has
> > >> > >> twice
> > >> > >> > the
> > >> > >> > > > > > > absolute
> > >> > >> > > > > > > > > > capacity. This may confuse people a bit. So,
> > >> perhaps
> > >> > >> > setting
> > >> > >> > > > the
> > >> > >> > > > > > > quota
> > >> > >> > > > > > > > > > based on an absolute request thread unit is
> > better.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
> > >> inter-broker
> > >> > >> > request
> > >> > >> > > > and
> > >> > >> > > > > > > needs
> > >> > >> > > > > > > > to
> > >> > >> > > > > > > > > > be excluded from throttling.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > 3. Implementation wise, I am wondering if it's
> > >> simpler
> > >> > >> to
> > >> > >> > > apply
> > >> > >> > > > > the
> > >> > >> > > > > > > > > request
> > >> > >> > > > > > > > > > time throttling first in KafkaApis.handle().
> > >> > Otherwise,
> > >> > >> we
> > >> > >> > > will
> > >> > >> > > > > > need
> > >> > >> > > > > > > to
> > >> > >> > > > > > > > > add
> > >> > >> > > > > > > > > > the throttling logic in each type of request.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > Thanks,
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > Jun
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini
> Sivaram <
> > >> > >> > > > > > > > rajinisivaram@gmail.com
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > wrote:
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > > Jun,
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > Thank you for the review.
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > I have reverted to the original KIP that
> > >> throttles
> > >> > >> based
> > >> > >> > on
> > >> > >> > > > > > request
> > >> > >> > > > > > > > > > handler
> > >> > >> > > > > > > > > > > utilization. At the moment, it uses
> percentage,
> > >> but
> > >> > I
> > >> > >> am
> > >> > >> > > > happy
> > >> > >> > > > > to
> > >> > >> > > > > > > > > change
> > >> > >> > > > > > > > > > to
> > >> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if
> > >> required. I
> > >> > >> have
> > >> > >> > > > added
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > > examples
> > >> > >> > > > > > > > > > > from this discussion to the KIP. Also added a
> > >> > "Future
> > >> > >> > Work"
> > >> > >> > > > > > section
> > >> > >> > > > > > > > to
> > >> > >> > > > > > > > > > > address network thread utilization. The
> > >> > configuration
> > >> > >> is
> > >> > >> > > > named
> > >> > >> > > > > > > > > > > "request_time_percent" with the expectation
> > that
> > >> it
> > >> > >> can
> > >> > >> > > also
> > >> > >> > > > be
> > >> > >> > > > > > > used
> > >> > >> > > > > > > > as
> > >> > >> > > > > > > > > > the
> > >> > >> > > > > > > > > > > limit for network thread utilization when
> that
> > is
> > >> > >> > > > implemented,
> > >> > >> > > > > so
> > >> > >> > > > > > > > that
> > >> > >> > > > > > > > > > > users have to set only one config for the two
> > and
> > >> > not
> > >> > >> > have
> > >> > >> > > to
> > >> > >> > > > > > worry
> > >> > >> > > > > > > > > about
> > >> > >> > > > > > > > > > > the internal distribution of the work between
> > the
> > >> > two
> > >> > >> > > thread
> > >> > >> > > > > > pools
> > >> > >> > > > > > > in
> > >> > >> > > > > > > > > > > Kafka.
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > Regards,
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > Rajini
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> > >> > >> > > jun@confluent.io>
> > >> > >> > > > > > > wrote:
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Hi, Rajini,
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Thanks for the proposal.
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > The benefit of using the request processing
> > >> time
> > >> > >> over
> > >> > >> > the
> > >> > >> > > > > > request
> > >> > >> > > > > > > > > rate
> > >> > >> > > > > > > > > > is
> > >> > >> > > > > > > > > > > > exactly what people have said. I will just
> > >> expand
> > >> > >> that
> > >> > >> > a
> > >> > >> > > > bit.
> > >> > >> > > > > > > > > Consider
> > >> > >> > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > following case. The producer sends a
> produce
> > >> > request
> > >> > >> > > with a
> > >> > >> > > > > > 10MB
> > >> > >> > > > > > > > > > message
> > >> > >> > > > > > > > > > > > but compressed to 100KB with gzip. The
> > >> > >> decompression of
> > >> > >> > > the
> > >> > >> > > > > > > message
> > >> > >> > > > > > > > > on
> > >> > >> > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > broker could take 10-15 seconds, during
> which
> > >> > time,
> > >> > >> a
> > >> > >> > > > request
> > >> > >> > > > > > > > handler
> > >> > >> > > > > > > > > > > > thread is completely blocked. In this case,
> > >> > neither
> > >> > >> the
> > >> > >> > > > > byte-in
> > >> > >> > > > > > > > quota
> > >> > >> > > > > > > > > > nor
> > >> > >> > > > > > > > > > > > the request rate quota may be effective in
> > >> > >> protecting
> > >> > >> > the
> > >> > >> > > > > > broker.
> > >> > >> > > > > > > > > > > Consider
> > >> > >> > > > > > > > > > > > another case. A consumer group starts with
> 10
> > >> > >> instances
> > >> > >> > > and
> > >> > >> > > > > > later
> > >> > >> > > > > > > > on
> > >> > >> > > > > > > > > > > > switches to 20 instances. The request rate
> > will
> > >> > >> likely
> > >> > >> > > > > double,
> > >> > >> > > > > > > but
> > >> > >> > > > > > > > > the
> > >> > >> > > > > > > > > > actual load on the broker may not double
> > >> since
> > >> > >> each
> > >> > >> > > fetch
> > >> > >> > > > > > > request
> > >> > >> > > > > > > > > > only
> > >> > >> > > > > > > > > > > > contains half of the partitions. Request
> rate
> > >> > quota
> > >> > >> may
> > >> > >> > > not
> > >> > >> > > > > be
> > >> > >> > > > > > > easy
> > >> > >> > > > > > > > > to
> > >> > >> > > > > > > > > > > > configure in this case.
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > What we really want is to be able to
> prevent
> > a
> > >> > >> client
> > >> > >> > > from
> > >> > >> > > > > > using
> > >> > >> > > > > > > > too
> > >> > >> > > > > > > > > > much
> > >> > >> > > > > > > > > > > > of the server side resources. In this
> > >> particular
> > >> > >> KIP,
> > >> > >> > > this
> > >> > >> > > > > > > resource
> > >> > >> > > > > > > > > is
> > >> > >> > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > capacity of the request handler threads. I
> > >> agree
> > >> > >> that
> > >> > >> > it
> > >> > >> > > > may
> > >> > >> > > > > > not
> > >> > >> > > > > > > be
> > >> > >> > > > > > > > > > > > intuitive for the users to determine how to
> > set
> > >> > the
> > >> > >> > right
> > >> > >> > > > > > limit.
> > >> > >> > > > > > > > > > However,
> > >> > >> > > > > > > > > > > > this is not completely new and has been
> done
> > in
> > >> > the
> > >> > >> > > > container
> > >> > >> > > > > > > world
> > >> > >> > > > > > > > > > > > already. For example, Linux cgroup (
> > >> > >> > > > > https://access.redhat.com/
> > >> > >> > > > > > > > > > > > documentation/en-US/Red_Hat_En
> > >> > >> terprise_Linux/6/html/
> > >> > >> > > > > > > > > > > > Resource_Management_Guide/sec-cpu.html)
> has
> > >> the
> > >> > >> > concept
> > >> > >> > > of
> > >> > >> > > > > > > > > > > > cpu.cfs_quota_us,
> > >> > >> > > > > > > > > > > > which specifies the total amount of time in
> > >> > >> > microseconds
> > >> > >> > > > for
> > >> > >> > > > > > > which
> > >> > >> > > > > > > > > all
> > >> > >> > > > > > > > > > > > tasks in a cgroup can run during a one
> second
> > >> > >> period.
> > >> > >> > We
> > >> > >> > > > can
> > >> > >> > > > > > > > > > potentially
> > >> > >> > > > > > > > > > > > model the request handler threads in a
> > similar
> > >> > way.
> > >> > >> For
> > >> > >> > > > > > example,
> > >> > >> > > > > > > > each
> > >> > >> > > > > > > > > > > > request handler thread can be 1 request
> > handler
> > >> > unit
> > >> > >> > and
> > >> > >> > > > the
> > >> > >> > > > > > > admin
> > >> > >> > > > > > > > > can
> > >> > >> > > > > > > > > > > > configure a limit on how many units (say
> > 0.01)
> > >> a
> > >> > >> client
> > >> > >> > > can
> > >> > >> > > > > > have.
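
A sketch of the unit arithmetic, following the cgroup analogy above (the
function is illustrative):

    // 1 unit == one request handler thread fully busy, so a quota of 0.01
    // units allows at most 1% of one thread's time per quota window, e.g.
    // 10 ms of handler time per 1000 ms window.
    def quotaExceeded(busyTimeMsInWindow: Double, quotaUnits: Double, windowMs: Long): Boolean =
      busyTimeMsInWindow > quotaUnits * windowMs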
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Regarding not throttling the internal
> broker
> > to
> > >> > >> broker
> > >> > >> > > > > > requests.
> > >> > >> > > > > > > We
> > >> > >> > > > > > > > > > could
> > >> > >> > > > > > > > > > > > do that. Alternatively, we could just let
> the
> > >> > admin
> > >> > >> > > > > configure a
> > >> > >> > > > > > > > high
> > >> > >> > > > > > > > > > > limit
> > >> > >> > > > > > > > > > > > for the kafka user (it may not be able to
> do
> > >> that
> > >> > >> > easily
> > >> > >> > > > > based
> > >> > >> > > > > > on
> > >> > >> > > > > > > > > > > clientId
> > >> > >> > > > > > > > > > > > though).
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Ideally we want to be able to protect the
> > >> > >> utilization
> > >> > >> > of
> > >> > >> > > > the
> > >> > >> > > > > > > > network
> > >> > >> > > > > > > > > > > thread
> > >> > >> > > > > > > > > > > > pool too. The difficulty is mostly what
> Rajini
> > >> > said:
> > >> > >> (1)
> > >> > >> > > The
> > >> > >> > > > > > > > mechanism
> > >> > >> > > > > > > > > > for
> > >> > >> > > > > > > > > > > > throttling the requests is through
> Purgatory
> > >> and
> > >> > we
> > >> > >> > will
> > >> > >> > > > have
> > >> > >> > > > > > to
> > >> > >> > > > > > > > > think
> > >> > >> > > > > > > > > > > > through how to integrate that into the
> > network
> > >> > >> layer.
> > >> > >> > > (2)
> > >> > >> > > > In
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > > network
> > >> > >> > > > > > > > > > > > layer, currently we know the user, but not
> > the
> > >> > >> clientId
> > >> > >> > > of
> > >> > >> > > > > the
> > >> > >> > > > > > > > > request.
> > >> > >> > > > > > > > > > > So,
> > >> > >> > > > > > > > > > > > it's a bit tricky to throttle based on
> > clientId
> > >> > >> there.
> > >> > >> > > > Plus,
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > > byteOut
> > >> > >> > > > > > > > > > > > quota can already protect the network
> thread
> > >> > >> > utilization
> > >> > >> > > > for
> > >> > >> > > > > > > fetch
> > >> > >> > > > > > > > > > > > requests. So, if we can't figure out this
> > part
> > >> > right
> > >> > >> > now,
> > >> > >> > > > > just
> > >> > >> > > > > > > > > focusing
> > >> > >> > > > > > > > > > > on
> > >> > >> > > > > > > > > > > > the request handling threads for this KIP
> is
> > >> > still a
> > >> > >> > > useful
> > >> > >> > > > > > > > feature.
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Thanks,
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Jun
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini
> > >> Sivaram <
> > >> > >> > > > > > > > > > rajinisivaram@gmail.com
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > wrote:
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > Thank you all for the feedback.
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > Jay: I have removed exemption for
> consumer
> > >> > >> heartbeat
> > >> > >> > > etc.
> > >> > >> > > > > > Agree
> > >> > >> > > > > > > > > that
> > >> > >> > > > > > > > > > > > > protecting the cluster is more important
> > than
> > >> > >> > > protecting
> > >> > >> > > > > > > > individual
> > >> > >> > > > > > > > > > > apps.
> > >> > >> > > > > > > > > > > > > Have retained the exemption for
> > >> > >> > > StopReplica/LeaderAndIsr
> > >> > >> > > > > > etc,
> > >> > >> > > > > > > > > these
> > >> > >> > > > > > > > > > > are
> > >> > >> > > > > > > > > > > > > throttled only if authorization fails (so
> > >> can't
> > >> > be
> > >> > >> > used
> > >> > >> > > > for
> > >> > >> > > > > > DoS
> > >> > >> > > > > > > > > > attacks
> > >> > >> > > > > > > > > > > > in
> > >> > >> > > > > > > > > > > > > a secure cluster, but allows inter-broker
> > >> > >> requests to
> > >> > >> > > > > > complete
> > >> > >> > > > > > > > > > without
> > >> > >> > > > > > > > > > > > > delays).
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > I will wait another day to see if there
> is
> > >> any
> > >> > >> > > objection
> > >> > >> > > > to
> > >> > >> > > > > > > > quotas
> > >> > >> > > > > > > > > > > based
> > >> > >> > > > > > > > > > > > on
> > >> > >> > > > > > > > > > > > > request processing time (as opposed to
> > >> request
> > >> > >> rate)
> > >> > >> > > and
> > >> > >> > > > if
> > >> > >> > > > > > > there
> > >> > >> > > > > > > > > are
> > >> > >> > > > > > > > > > > no
> > >> > >> > > > > > > > > > > > > objections, I will revert to the original
> > >> > proposal
> > >> > >> > with
> > >> > >> > > > > some
> > >> > >> > > > > > > > > changes.
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > The original proposal was only including
> > the
> > >> > time
> > >> > >> > used
> > >> > >> > > by
> > >> > >> > > > > the
> > >> > >> > > > > > > > > request
> > >> > >> > > > > > > > > > > > > handler threads (that made calculation
> > >> easy). I
> > >> > >> think
> > >> > >> > > the
> > >> > >> > > > > > > > > suggestion
> > >> > >> > > > > > > > > > is
> > >> > >> > > > > > > > > > > > to
> > >> > >> > > > > > > > > > > > > include the time spent in the network
> > >> threads as
> > >> > >> well
> > >> > >> > > > since
> > >> > >> > > > > > > that
> > >> > >> > > > > > > > > may
> > >> > >> > > > > > > > > > be
> > >> > >> > > > > > > > > > > > > significant. As Jay pointed out, it is
> more
> > >> > >> > complicated
> > >> > >> > > > to
> > >> > >> > > > > > > > > calculate
> > >> > >> > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > total available CPU time and convert to a
> > >> ratio
> > >> > >> when
> > >> > >> > > > there
> > >> > >> > > > > > *m*
> > >> > >> > > > > > > > I/O
> > >> > >> > > > > > > > > > > > threads
> > >> > >> > > > > > > > > > > > > and *n* network threads.
> > >> > >> > ThreadMXBean#getThreadCPUTime(
> > >> > >> > > )
> > >> > >> > > > > may
> > >> > >> > > > > > > > give
> > >> > >> > > > > > > > > us
> > >> > >> > > > > > > > > > > > what
> > >> > >> > > > > > > > > > > > > we want, but it can be very expensive on
> > some
> > >> > >> > > platforms.
> > >> > >> > > > As
> > >> > >> > > > > > > > Becket
> > >> > >> > > > > > > > > > and
> > >> > >> > > > > > > > > > > > > Guozhang have pointed out, we do have
> > several
> > >> > time
> > >> > >> > > > > > measurements
> > >> > >> > > > > > > > > > already
> > >> > >> > > > > > > > > > > > for
> > >> > >> > > > > > > > > > > > > generating metrics that we could use,
> > though
> > >> we
> > >> > >> might
> > >> > >> > > > want
> > >> > >> > > > > to
> > >> > >> > > > > > > > > switch
> > >> > >> > > > > > > > > > to
> > >> > >> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis()
> > >> since
> > >> > >> some
> > >> > >> > of
> > >> > >> > > > the
> > >> > >> > > > > > > > values
> > >> > >> > > > > > > > > > for
> > >> > >> > > > > > > > > > > > > small requests may be < 1ms. But rather
> > than
> > >> add
> > >> > >> up
> > >> > >> > the
> > >> > >> > > > > time
> > >> > >> > > > > > > > spent
> > >> > >> > > > > > > > > in
> > >> > >> > > > > > > > > > > I/O
> > >> > >> > > > > > > > > > > > > thread and network thread, wouldn't it be
> > >> better
> > >> > >> to
> > >> > >> > > > convert
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > time
> > >> > >> > > > > > > > > > > > spent
> > >> > >> > > > > > > > > > > > > on each thread into a separate ratio?
> UserA
> > >> has
> > >> > a
> > >> > >> > > request
> > >> > >> > > > > > quota
> > >> > >> > > > > > > > of
> > >> > >> > > > > > > > > > 5%.
> > >> > >> > > > > > > > > > > > Can
> > >> > >> > > > > > > > > > > > > we take that to mean that UserA can use
> 5%
> > of
> > >> > the
> > >> > >> > time
> > >> > >> > > on
> > >> > >> > > > > > > network
> > >> > >> > > > > > > > > > > threads
> > >> > >> > > > > > > > > > > > > and 5% of the time on I/O threads? If
> > either
> > >> is
> > >> > >> > > exceeded,
> > >> > >> > > > > the
> > >> > >> > > > > > > > > > response
> > >> > >> > > > > > > > > > > is
> > >> > >> > > > > > > > > > > > > throttled - it would mean maintaining two
> > >> sets
> > >> > of
> > >> > >> > > metrics
> > >> > >> > > > > for
> > >> > >> > > > > > > the
> > >> > >> > > > > > > > > two
> > >> > >> > > > > > > > > > > > > durations, but would result in more
> > >> meaningful
> > >> > >> > ratios.
> > >> > >> > > We
> > >> > >> > > > > > could
> > >> > >> > > > > > > > > > define
> > >> > >> > > > > > > > > > > > two
> > >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request
> > threads
> > >> > and
> > >> > >> 10%
> > >> > >> > > of
> > >> > >> > > > > > > network
> > >> > >> > > > > > > > > > > > threads),
> > >> > >> > > > > > > > > > > > > but that seems unnecessary and harder to
> > >> explain
> > >> > >> to
> > >> > >> > > > users.
> > >> > >> > > > > > > > > > > > >
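
For reference, a minimal sketch of the ThreadMXBean approach mentioned above
(the surrounding request handling is elided):

    import java.lang.management.ManagementFactory

    // getCurrentThreadCpuTime returns nanoseconds (or -1 when unsupported or
    // disabled) and, as noted above, can be expensive on some platforms, so
    // it would need benchmarking before use on the hot path.
    val bean = ManagementFactory.getThreadMXBean
    if (bean.isThreadCpuTimeSupported && !bean.isThreadCpuTimeEnabled)
      bean.setThreadCpuTimeEnabled(true)

    val startNanos = bean.getCurrentThreadCpuTime
    // ... handle the request on this thread ...
    val cpuNanos = bean.getCurrentThreadCpuTime - startNanos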
> > >> > >> > > > > > > > > > > > > Back to why and how quotas are applied to
> > >> > network
> > >> > >> > > thread
> > >> > >> > > > > > > > > utilization:
> > >> > >> > > > > > > > > > > > > a) In the case of fetch,  the time spent
> in
> > >> the
> > >> > >> > network
> > >> > >> > > > > > thread
> > >> > >> > > > > > > > may
> > >> > >> > > > > > > > > be
> > >> > >> > > > > > > > > > > > > significant and I can see the need to
> > include
> > >> > >> this.
> > >> > >> > Are
> > >> > >> > > > > there
> > >> > >> > > > > > > > other
> > >> > >> > > > > > > > > > > > > requests where the network thread
> > >> utilization is
> > >> > >> > > > > significant?
> > >> > >> > > > > > > In
> > >> > >> > > > > > > > > the
> > >> > >> > > > > > > > > > > case
> > >> > >> > > > > > > > > > > > > of fetch, request handler thread
> > utilization
> > >> > would
> > >> > >> > > > throttle
> > >> > >> > > > > > > > clients
> > >> > >> > > > > > > > > > > with
> > >> > >> > > > > > > > > > > > > high request rate, low data volume and
> > fetch
> > >> > byte
> > >> > >> > rate
> > >> > >> > > > > quota
> > >> > >> > > > > > > will
> > >> > >> > > > > > > > > > > > throttle
> > >> > >> > > > > > > > > > > > > clients with high data volume. Network
> > thread
> > >> > >> > > utilization
> > >> > >> > > > > is
> > >> > >> > > > > > > > > perhaps
> > >> > >> > > > > > > > > > > > > proportional to the data volume. I am
> > >> wondering
> > >> > >> if we
> > >> > >> > > > even
> > >> > >> > > > > > need
> > >> > >> > > > > > > > to
> > >> > >> > > > > > > > > > > > throttle
> > >> > >> > > > > > > > > > > > > based on network thread utilization or
> > >> whether
> > >> > the
> > >> > >> > data
> > >> > >> > > > > > volume
> > >> > >> > > > > > > > > quota
> > >> > >> > > > > > > > > > > > covers
> > >> > >> > > > > > > > > > > > > this case.
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > b) At the moment, we record and check for
> > >> quota
> > >> > >> > > violation
> > >> > >> > > > > at
> > >> > >> > > > > > > the
> > >> > >> > > > > > > > > same
> > >> > >> > > > > > > > > > > > time.
> > >> > >> > > > > > > > > > > > > If a quota is violated, the response is
> > >> delayed.
> > >> > >> > Using
> > >> > >> > > > > Jay's
> > >> > >> > > > > > > > > example
> > >> > >> > > > > > > > > > of
> > >> > >> > > > > > > > > > > > > disk reads for fetches happening in the
> > >> network
> > >> > >> > thread,
> > >> > >> > > > We
> > >> > >> > > > > > > can't
> > >> > >> > > > > > > > > > record
> > >> > >> > > > > > > > > > > > and
> > >> > >> > > > > > > > > > > > > delay a response after the disk reads. We
> > >> could
> > >> > >> > record
> > >> > >> > > > the
> > >> > >> > > > > > time
> > >> > >> > > > > > > > > spent
> > >> > >> > > > > > > > > > > on
> > >> > >> > > > > > > > > > > > > the network thread when the response is
> > >> complete
> > >> > >> and
> > >> > >> > > > > > introduce
> > >> > >> > > > > > > a
> > >> > >> > > > > > > > > > delay
> > >> > >> > > > > > > > > > > > for
> > >> > >> > > > > > > > > > > > > handling a subsequent request (separate
> out
> > >> > >> recording
> > >> > >> > > and
> > >> > >> > > > > > quota
> > >> > >> > > > > > > > > > > violation
> > >> > >> > > > > > > > > > > > > handling in the case of network thread
> > >> > overload).
> > >> > >> > Does
> > >> > >> > > > that
> > >> > >> > > > > > > make
> > >> > >> > > > > > > > > > sense?
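
A sketch of that record-now, throttle-later idea (illustrative; a real
implementation would live in the network layer):

    import scala.collection.mutable

    // Network-thread time is only known once the response completes, so any
    // overage is remembered and charged as a delay on the client's next request.
    object NetworkThreadThrottle {
      private val owedDelayMs = mutable.Map[String, Long]()

      def onResponseComplete(clientId: String, networkTimeMs: Long, allowedMs: Long): Unit =
        synchronized {
          val overage = networkTimeMs - allowedMs
          if (overage > 0)
            owedDelayMs(clientId) = owedDelayMs.getOrElse(clientId, 0L) + overage
        }

      def delayForNextRequestMs(clientId: String): Long =
        synchronized { owedDelayMs.remove(clientId).getOrElse(0L) }
    }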
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > Regards,
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > Rajini
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket
> > Qin <
> > >> > >> > > > > > > > becket.qin@gmail.com>
> > >> > >> > > > > > > > > > > > wrote:
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Hey Jay,
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU
> time
> > >> is a
> > >> > >> > little
> > >> > >> > > > > > > tricky. I
> > >> > >> > > > > > > > > am
> > >> > >> > > > > > > > > > > > > thinking
> > >> > >> > > > > > > > > > > > > > that maybe we can use the existing
> > request
> > >> > >> > > statistics.
> > >> > >> > > > > They
> > >> > >> > > > > > > are
> > >> > >> > > > > > > > > > > already
> > >> > >> > > > > > > > > > > > > > very detailed so we can probably see
> the
> > >> > >> > approximate
> > >> > >> > > > CPU
> > >> > >> > > > > > time
> > >> > >> > > > > > > > > from
> > >> > >> > > > > > > > > > > it,
> > >> > >> > > > > > > > > > > > > e.g.
> > >> > >> > > > > > > > > > > > > > something like (total_time -
> > >> > >> > > > request/response_queue_time
> > >> > >> > > > > -
> > >> > >> > > > > > > > > > > > remote_time).
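
That approximation in code, as a sketch (the fields mirror the existing
request statistics named here; the case class itself is illustrative):

    // Approximate broker-side work as total time minus time spent queued and
    // time spent waiting on remote work (e.g. purgatory).
    case class RequestStats(totalTimeMs: Double,
                            requestQueueTimeMs: Double,
                            responseQueueTimeMs: Double,
                            remoteTimeMs: Double)

    def approxCpuTimeMs(s: RequestStats): Double =
      s.totalTimeMs - s.requestQueueTimeMs - s.responseQueueTimeMs - s.remoteTimeMs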
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user
> is
> > >> > >> throttled
> > >> > >> > > it
> > >> > >> > > > is
> > >> > >> > > > > > > > likely
> > >> > >> > > > > > > > > > that
> > >> > >> > > > > > > > > > > > we
> > >> > >> > > > > > > > > > > > > > need to see if anything has gone wrong
> > >> first,
> > >> > >> and
> > >> > >> > if
> > >> > >> > > > the
> > >> > >> > > > > > > users
> > >> > >> > > > > > > > > are
> > >> > >> > > > > > > > > > > well
> > >> > >> > > > > > > > > > > > > > behaving and just need more resources,
> we
> > >> will
> > >> > >> have
> > >> > >> > > to
> > >> > >> > > > > bump
> > >> > >> > > > > > > up
> > >> > >> > > > > > > > > the
> > >> > >> > > > > > > > > > > > quota
> > >> > >> > > > > > > > > > > > > > for them. It is true that
> pre-allocating
> > >> CPU
> > >> > >> time
> > >> > >> > > quota
> > >> > >> > > > > > > > precisely
> > >> > >> > > > > > > > > > for
> > >> > >> > > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > > users is difficult. So in practice it
> > would
> > >> > >> > probably
> > >> > >> > > be
> > >> > >> > > > > > more
> > >> > >> > > > > > > > like
> > >> > >> > > > > > > > > > > first
> > >> > >> > > > > > > > > > > > > set
> > >> > >> > > > > > > > > > > > > > a relative high protective CPU time
> quota
> > >> for
> > >> > >> > > everyone
> > >> > >> > > > > and
> > >> > >> > > > > > > > > increase
> > >> > >> > > > > > > > > > > > that
> > >> > >> > > > > > > > > > > > > > for some individual clients on demand.
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Thanks,
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM,
> Guozhang
> > >> > Wang <
> > >> > >> > > > > > > > > wangguoz@gmail.com
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > wrote:
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see
> > it
> > >> > >> > happening.
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling,
> or
> > >> more
> > >> > >> > > > > specifically
> > >> > >> > > > > > > > > > > processing
> > >> > >> > > > > > > > > > > > > time
> > >> > >> > > > > > > > > > > > > > > ratio instead of the request rate
> > >> throttling
> > >> > >> as
> > >> > >> > > well.
> > >> > >> > > > > > > Becket
> > >> > >> > > > > > > > > has
> > >> > >> > > > > > > > > > > very
> > >> > >> > > > > > > > > > > > > > well
> > >> > >> > > > > > > > > > > > > > > summed my rationales above, and one
> > >> thing to
> > >> > >> add
> > >> > >> > > here
> > >> > >> > > > > is
> > >> > >> > > > > > > that
> > >> > >> > > > > > > > > the
> > >> > >> > > > > > > > > > > > > former
> > >> > >> > > > > > > > > > > > > > > has a good support for both
> "protecting
> > >> > >> against
> > >> > >> > > rogue
> > >> > >> > > > > > > > clients"
> > >> > >> > > > > > > > > as
> > >> > >> > > > > > > > > > > > well
> > >> > >> > > > > > > > > > > > > as
> "utilizing a cluster for multi-tenancy usage": when thinking about how to
> explain this to the end users, I find it actually more natural than the
> request rate since, as mentioned above, different requests will have quite
> different "cost", and Kafka today already has various request types
> (produce, fetch, admin, metadata, etc); because of that, request rate
> throttling may not be as effective unless it is set very conservatively.
>
> Regarding user reactions when they are throttled, I think it may differ
> case-by-case, and needs to be discovered / guided by looking at related
> metrics. So in other words users would not expect to get additional
> information by simply being told "hey, you are throttled", which is all
> that throttling does; they need to take a follow-up step and see "hmm, I'm
> throttled probably because of ..", which is done by looking at other
> metric values: e.g. whether I'm bombarding the brokers with metadata
> requests, which are usually cheap to handle but I'm sending thousands per
> second; or is it because I'm catching up and hence sending very heavy
> fetch requests with large min.bytes, etc.
>
> Regarding the implementation, as once discussed with Jun, this seems not
> very difficult since today we are already collecting the "thread pool
> utilization" metrics, which is a single percentage "aggregateIdleMeter"
> value; we are already effectively aggregating it for each request in
> KafkaRequestHandler, and we can just extend it by recording the source
> client id when handling them and aggregating by clientId as well as the
> total aggregate.
>
> Guozhang
>
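A minimal sketch of the per-clientId bookkeeping Guozhang describes above, with invented names throughout (the real aggregation would live in the broker's KafkaRequestHandler, which is Scala; this only illustrates the idea):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    // Records request-handler time in aggregate and per clientId, extending
    // the existing single "thread pool utilization" measurement.
    class HandlerTimeRecorder {
        private final LongAdder totalNanos = new LongAdder();
        private final ConcurrentHashMap<String, LongAdder> perClientNanos =
                new ConcurrentHashMap<>();

        void record(String clientId, long elapsedNanos) {
            totalNanos.add(elapsedNanos);
            perClientNanos.computeIfAbsent(clientId, id -> new LongAdder())
                          .add(elapsedNanos);
        }

        // Fraction of all measured handler time consumed by one client.
        double clientShare(String clientId) {
            long total = totalNanos.sum();
            LongAdder client = perClientNanos.get(clientId);
            return (total == 0 || client == null) ? 0.0
                    : (double) client.sum() / total;
        }
    }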
> On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <jay@confluent.io> wrote:
> > Hey Becket/Rajini,
> >
> > When I thought about it more deeply I came around to the "percent of
> > processing time" metric too. It seems a lot closer to the thing we
> > actually care about and need to protect. I also think this would be a
> > very useful metric even in the absence of throttling, just to debug who
> > is using capacity.
> >
> > Two problems to consider:
> >
> >    1. I agree that for the user it is understandable what led to their
> >    being throttled, but it is a bit hard to figure out the safe range for
> >    them. i.e. if I have a new app that will send 200 messages/sec I can
> >    probably reason that I'll be under the throttling limit of 300 req/sec.
> >    However if I need to be under a 10% CPU resources limit it may be a bit
> >    harder for me to know a priori if I will or won't.
> >    2. Calculating the available CPU time is a bit difficult since there
> >    are actually two thread pools--the I/O threads and the network threads.
> >    I think it might be workable to count just the I/O thread time as in
> >    the proposal, but the network thread work is actually non-trivial
> >    (e.g. all the disk reads for fetches happen in that thread). If you
> >    count both the network and I/O threads it can skew things a bit. E.g.
> >    say you have 50 network threads, 10 I/O threads, and 8 cores, what is
> >    the available cpu time in a second? I suppose this is a problem
> >    whenever you have a bottleneck between I/O and network threads or if
> >    you end up significantly over-provisioning one pool (both of which are
> >    hard to avoid).
> >
> > An alternative for CPU throttling would be to use this api:
> > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> >
> > That would let you track actual CPU usage across the network, I/O, and
> > purgatory threads and look at it as a percentage of total cores. I think
> > this fixes many problems in the reliability of the metric. Its meaning
> > is slightly different as it is just CPU (you don't get charged for time
> > blocking on I/O) but that may be okay because we already have a throttle
> > on I/O. The downside is I think it is possible this api can be disabled
> > or isn't always available and it may also be expensive (also I've never
> > used it so not sure if it really works the way I think).
> >
> > -Jay
> >
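For reference, the ThreadMXBean API Jay links to can be exercised as below. This is just a self-contained probe; whether it is cheap and reliable enough to call per request is exactly the caveat he raises:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class CpuTimeProbe {
        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            if (!bean.isThreadCpuTimeSupported()) {
                System.out.println("Per-thread CPU time not supported on this JVM");
                return;
            }
            bean.setThreadCpuTimeEnabled(true); // may be disabled by default

            long threadId = Thread.currentThread().getId();
            long cpuNanos = bean.getThreadCpuTime(threadId); // -1 if the thread has died
            int cores = Runtime.getRuntime().availableProcessors();

            // CPU use as a fraction of one second of total core capacity; time
            // spent blocked on I/O is not charged, as noted above.
            double shareOfCores = cpuNanos / (cores * 1_000_000_000.0);
            System.out.printf("cpu=%dns, share of 1s across %d cores = %.6f%n",
                    cpuNanos, cores, shareOfCores);
        }
    }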
> > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com> wrote:
> > > If the purpose of the KIP is only to protect the cluster from being
> > > overwhelmed by crazy clients and is not intended to address the
> > > resource allocation problem among the clients, I am wondering if using
> > > a request handling time quota (CPU time quota) is a better option.
> > > Here are the reasons:
> > >
> > > 1. A request handling time quota offers better protection. Say we have
> > > a request rate quota and set that to some value like 100 requests/sec;
> > > it is possible that some of the requests are very expensive and
> > > actually take a lot of time to handle. In that case a few clients may
> > > still occupy a lot of CPU time even though the request rate is low.
> > > Arguably we can carefully set a request rate quota for each request
> > > and client id combination, but it could still be tricky to get it
> > > right for everyone.
> > >
> > > If we use the request handling time quota, we can simply say no client
> > > can take up more than 30% of the total request handling capacity
> > > (measured by time), regardless of the difference among different
> > > requests or what the client is doing. In this case maybe we can quota
> > > all the requests if we want to.
> > >
> > > 2. The main benefit of using a request rate limit is that it seems
> > > more intuitive. It is true that it is probably easier to explain to
> > > the user what it means. However, in practice it looks like the impact
> > > of a request rate quota is not more quantifiable than that of the
> > > request handling time quota. Unlike the byte rate quota, it is still
> > > difficult to give a number for the impact on throughput or latency
> > > when a request rate quota is hit. So it is not better than the request
> > > handling time quota. In fact I feel it is clearer to tell the user
> > > "you are limited because you have taken 30% of the CPU time on the
> > > broker" than something like "your request rate quota on metadata
> > > requests has been reached".
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io> wrote:
> > > > I think this proposal makes a lot of sense (especially now that it
> > > > is oriented around request rate) and fills the biggest remaining gap
> > > > in the multi-tenancy story.
> > > >
> > > > I think for intra-cluster communication (StopReplica, etc) we could
> > > > avoid throttling entirely. You can secure or otherwise lock down the
> > > > cluster communication to avoid any unauthorized external party from
> > > > trying to initiate these requests. As a result we are as likely to
> > > > cause problems as solve them by throttling these, right?
> > > >
> > > > I'm not so sure that we should exempt the consumer requests such as
> > > > heartbeat. It's true that if we throttle an app's heartbeat requests
> > > > it may cause it to fall out of its consumer group. However if we
> > > > don't throttle it, it may DDOS the cluster if the heartbeat interval
> > > > is set incorrectly or if some client in some language has a bug. I
> > > > think the policy with this kind of throttling is to protect the
> > > > cluster above any individual app, right? I think in general this
> > > > should be okay since for most deployments this setting is meant as
> > > > more of a safety valve---that is, rather than setting something very
> > > > close to what you expect to need (say 2 req/sec or whatever) you
> > > > would have something quite high (like 100 req/sec) with this meant
> > > > to prevent a client gone crazy. I think when used this way allowing
> > > > those to be throttled would actually provide meaningful protection.
> > > >
> > > > -Jay
> > > >
> > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > [original KIP-124 announcement]

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Colin McCabe <cm...@apache.org>.
That makes sense.  I didn't see in some of the replies that this field
already existed -- good clarification.

best,
Colin


On Wed, Mar 1, 2017, at 05:41, Rajini Sivaram wrote:
> Colin,
> 
> Thank you for the feedback. Since we are reusing the existing
> throttle_time_ms field for produce/fetch responses, changing this to
> microseconds would be a breaking change. Since we don't currently plan to
> throttle at sub-millisecond intervals, perhaps it makes sense to keep the
> value consistent with the existing responses (and metrics which expose
> this
> value) and change them all together in future if required?
> 
> Regards,
> 
> Rajini
> 
> On Tue, Feb 28, 2017 at 5:58 PM, Colin McCabe <cm...@apache.org> wrote:
> 
> > I noticed that the throttle_time_ms added to all the message responses
> > is in milliseconds.  Does it make sense to express this in microseconds
> > in case we start doing more fine-grained CPU throttling later on?  An
> > int32 should still be more than enough if using microseconds.
> >
> > best,
> > Colin
> >
> >
> > On Fri, Feb 24, 2017, at 10:31, Jun Rao wrote:
> > > Hi, Jay,
> > >
> > > 2. Regarding request.unit vs request.percentage: I started with
> > > request.percentage too. The reasoning for request.unit is the
> > > following. Suppose that the capacity has been reached on a broker and
> > > the admin needs to add a new user. A simple way to increase the
> > > capacity is to increase the number of io threads, assuming there are
> > > still enough cores. If the limit is based on percentage, the
> > > additional capacity automatically gets distributed to existing users
> > > and we haven't really carved out any additional resource for the new
> > > user. Now, is it easy for a user to reason about 0.1 unit vs 10%? My
> > > feeling is that both are hard and have to be configured empirically.
> > > Not sure if percentage is obviously easier to reason about.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
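To make the trade-off concrete, a small worked example (the thread counts and quota values here are made up for illustration, not taken from the KIP):

    num.io.threads = 8,  quota = 10%        -> 0.8 threads of capacity
    num.io.threads = 16, quota = 10%        -> 1.6 threads (capacity doubled implicitly)

    num.io.threads = 8,  quota = 0.8 units  -> 0.8 threads of capacity
    num.io.threads = 16, quota = 0.8 units  -> still 0.8 threads; the added
                                               capacity stays free for a new user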
> > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:
> > >
> > > > A couple of quick points:
> > > >
> > > > 1. Even though the implementation of this quota is only using io
> > > > thread time, I think we should call it something like "request-time".
> > > > This will give us flexibility to improve the implementation to cover
> > > > network threads in the future and will avoid exposing internal
> > > > details like our thread pools on the server.
> > > >
> > > > 2. Jun/Roger, I get what you are trying to fix but the idea of
> > > > thread/units is super unintuitive as a user-facing knob. I had to
> > > > read the KIP like eight times to understand this. I'm not sure about
> > > > your point that increasing the number of threads is a problem with a
> > > > percentage-based value; it really depends on whether the user thinks
> > > > about the "percentage of request processing time" or "thread units".
> > > > If they think "I have allocated 10% of my request processing time to
> > > > user x" then it is a bug that increasing the thread count decreases
> > > > that percent as it does in the current proposal. As a practical
> > > > matter I think the only way to actually reason about this is as a
> > > > percent---I just don't believe people are going to think, "ah, 4.3
> > > > thread units, that is the right amount!". Instead I think they have
> > > > to understand this thread unit concept, figure out what they have
> > > > set in number of threads, compute a percent and then come up with
> > > > the number of thread units, and these will all be wrong if that
> > > > thread count changes. I also think this ties us to throttling the
> > > > I/O thread pool, which may not be where we want to end up.
> > > >
> > > > 3. For what it's worth I do think having a single throttle_ms field
> > > > in all the responses that combines all throttling from all quotas is
> > > > probably the simplest. There could be a use case for having separate
> > > > fields for each, but I think that is actually harder to use/monitor
> > > > in the common case so unless someone has a use case I think just one
> > > > should be fine.
> > > >
> > > > -Jay
> > > >
> > > > On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > I have updated the KIP based on the discussions so far.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > >
> > > > > > Thank you all for the feedback.
> > > > > >
> > > > > > Ismael #1. It makes sense not to throttle inter-broker requests
> > > > > > like LeaderAndIsr etc. The simplest way to ensure that clients
> > > > > > cannot use these requests to bypass quotas for DoS attacks is to
> > > > > > ensure that ACLs prevent clients from using these requests and
> > > > > > that unauthorized requests are included towards quotas.
> > > > > >
> > > > > > Ismael #2, Jay #1: I was thinking that these quotas can return a
> > > > > > separate throttle time, and all utilization based quotas could
> > > > > > use the same field (we won't add another one for network thread
> > > > > > utilization, for instance). But perhaps it makes sense to keep
> > > > > > byte rate quotas separate in produce/fetch responses to provide
> > > > > > separate metrics? Agree with Ismael that the name of the existing
> > > > > > field should be changed if we have two. Happy to switch to a
> > > > > > single combined throttle time if that is sufficient.
> > > > > >
> > > > > > Ismael #4, #5, #6: Will update the KIP. Will use a dot separated
> > > > > > name for the new property. Replication quotas use dot separated,
> > > > > > so it will be consistent with all properties except byte rate
> > > > > > quotas.
> > > > > >
> > > > > > Radai: #1 Request processing time rather than request rate was
> > > > > > chosen because the time per request can vary significantly
> > > > > > between requests, as mentioned in the discussion and KIP.
> > > > > > #2 Two separate quotas for heartbeats/regular requests feel like
> > > > > > more configuration and more metrics. Since most users would set
> > > > > > quotas higher than the expected usage and quotas are more of a
> > > > > > safety net, a single quota should work in most cases.
> > > > > > #3 The number of requests in purgatory is limited by the number
> > > > > > of active connections, since only one request per connection
> > > > > > will be throttled at a time.
> > > > > > #4 As with byte rate quotas, to use the full allocated quotas,
> > > > > > clients/users would need to use partitions that are distributed
> > > > > > across the cluster. The alternative of using cluster-wide quotas
> > > > > > instead of per-broker quotas would be far too complex to
> > > > > > implement.
> > > > > >
> > > > > > Dong: We currently have two ClientQuotaManagers for the quota
> > > > > > types Fetch and Produce. A new one will be added for IOThread,
> > > > > > which manages quotas for I/O thread utilization. This will not
> > > > > > update the Fetch or Produce queue-size, but will have a separate
> > > > > > metric for the queue-size.  I wasn't planning to add any
> > > > > > additional metrics apart from the equivalent ones for existing
> > > > > > quotas as part of this KIP. The ratio of byte-rate to I/O thread
> > > > > > utilization could be slightly misleading since it depends on the
> > > > > > sequence of requests. But we can look into more metrics after
> > > > > > the KIP is implemented, if required.
> > > > > >
> > > > > > I think we need to limit the maximum delay since all requests
> > > > > > are throttled. If a client has a quota of 0.001 units and a
> > > > > > single request used 50ms, we don't want to delay all requests
> > > > > > from the client by 50 seconds, throwing the client out of all
> > > > > > its consumer groups. The issue arises only if a user is
> > > > > > allocated a quota that is insufficient to process one large
> > > > > > request. The expectation is that the units allocated per user
> > > > > > will be much higher than the time taken to process one request,
> > > > > > and the limit should seldom be applied. Agree this needs proper
> > > > > > documentation.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > >
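Rajini's 50-second example works out as follows. The sketch below is a simplification of the quota-window math, with invented variable names, shown only to illustrate why a cap is needed:

    // quota = 0.001 (fraction of one I/O thread), observedTimeMs = 50:
    // spreading 50 ms of thread time thinly enough to average back down to
    // the quota would require 50 / 0.001 = 50,000 ms of delay, so the delay
    // is capped at the quota window (e.g. 1,000 ms) instead.
    static long throttleDelayMs(double observedTimeMs, double quota, long windowMs) {
        long naiveDelayMs = (long) (observedTimeMs / quota); // 50_000 here
        return Math.min(naiveDelayMs, windowMs);             // 1_000 after the cap
    }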
> > > > > > On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:
> > > > > >
> > > > > >> @jun: I wasn't concerned about tying up a request processing
> > > > > >> thread, but IIUC the code does still read the entire request
> > > > > >> out, which might add up to a non-negligible amount of memory.
> > > > > >>
> > > > > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com> wrote:
> > > > > >>
> > > > > >> > Hey Rajini,
> > > > > >> >
> > > > > >> > The current KIP says that the maximum delay will be reduced to
> > > > > >> > the window size if it is larger than the window size. I have a
> > > > > >> > concern with this:
> > > > > >> >
> > > > > >> > 1) This essentially means that the user is allowed to exceed
> > > > > >> > their quota over a long period of time. Can you provide an
> > > > > >> > upper bound on this deviation?
> > > > > >> >
> > > > > >> > 2) What is the motivation for capping the maximum delay by the
> > > > > >> > window size? I am wondering if there is a better alternative
> > > > > >> > to address the problem.
> > > > > >> >
> > > > > >> > 3) It means that the existing metric-related config will have
> > > > > >> > a more direct impact on the mechanism of this
> > > > > >> > io-thread-unit-based quota. This may be an important change
> > > > > >> > depending on the answer to 1) above. We probably need to
> > > > > >> > document this more explicitly.
> > > > > >> >
> > > > > >> > Dong
> > > > > >> >
> > > > > >> >
> > > > > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:
> > > > > >> >
> > > > > >> > > Hey Jun,
> > > > > >> > >
> > > > > >> > > Yeah you are right. I thought it wasn't, because at LinkedIn
> > > > > >> > > it would be too much pressure on inGraph to expose those
> > > > > >> > > per-clientId metrics, so we ended up printing them
> > > > > >> > > periodically to a local log. Never mind if it is not a
> > > > > >> > > general problem.
> > > > > >> > >
> > > > > >> > > Hey Rajini,
> > > > > >> > >
> > > > > >> > > - I agree with Jay that we probably don't want to add a new
> > > > > >> > > field for every quota in ProduceResponse or FetchResponse.
> > > > > >> > > Is there any use-case for having separate throttle-time
> > > > > >> > > fields for the byte-rate quota and the io-thread-unit quota?
> > > > > >> > > You probably need to document this as an interface change if
> > > > > >> > > you plan to add a new field in any request.
> > > > > >> > >
> > > > > >> > > - I don't think IOThread belongs in quotaType. The existing
> > > > > >> > > quota types (i.e. Produce/Fetch/LeaderReplication/
> > > > > >> > > FollowerReplication) identify the type of request that is
> > > > > >> > > throttled, not the quota mechanism that is applied.
> > > > > >> > >
> > > > > >> > > - If a request is throttled due to this io-thread-unit-based
> > > > > >> > > quota, is the existing queue-size metric in
> > > > > >> > > ClientQuotaManager incremented?
> > > > > >> > >
> > > > > >> > > - In the interest of providing a guideline for admins to
> > > > > >> > > decide the io-thread-unit-based quota, and for users to
> > > > > >> > > understand its impact on their traffic, would it be useful
> > > > > >> > > to have a metric that shows the overall byte-rate per
> > > > > >> > > io-thread-unit? Can we also show this as a per-clientId
> > > > > >> > > metric?
> > > > > >> > >
> > > > > >> > > Thanks,
> > > > > >> > > Dong
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:
> > > > > >> > >
> > > > > >> > >> Hi, Ismael,
> > > > > >> > >>
> > > > > >> > >> For #3, typically, an admin won't configure more io threads
> > > > > >> > >> than CPU cores, but it's possible for an admin to start
> > > > > >> > >> with fewer io threads than cores and grow that later on.
> > > > > >> > >>
> > > > > >> > >> Hi, Dong,
> > > > > >> > >>
> > > > > >> > >> I think the throttleTime sensor on the broker tells the
> > > > > >> > >> admin whether a user/clientId is throttled or not.
> > > > > >> > >>
> > > > > >> > >> Hi, Radai,
> > > > > >> > >>
> > > > > >> > >> The reasoning for delaying the throttled requests on the
> > > > > >> > >> broker instead of returning an error immediately is that
> > > > > >> > >> the latter has no way to prevent the client from retrying
> > > > > >> > >> immediately, which will make things worse. The delaying
> > > > > >> > >> logic is based on a delay queue. A separate expiration
> > > > > >> > >> thread just waits on the next request to be expired. So, it
> > > > > >> > >> doesn't tie up a request handler thread.
> > > > > >> > >>
> > > > > >> > >> Thanks,
> > > > > >> > >>
> > > > > >> > >> Jun
> > > > > >> > >>
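The delaying logic Jun describes maps naturally onto java.util.concurrent.DelayQueue. A minimal sketch of that pattern, with invented class names (this is not the actual broker implementation):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // One entry per throttled response; it becomes ready once its delay expires.
    class ThrottledResponse implements Delayed {
        private final long sendAtNanos;
        private final Runnable sendResponse;

        ThrottledResponse(long delayMs, Runnable sendResponse) {
            this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
            this.sendResponse = sendResponse;
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }

        void complete() { sendResponse.run(); }
    }

    // A single expiration thread blocks on take(), so no request handler
    // thread is tied up while a response waits out its throttle time.
    class ThrottleExpirer extends Thread {
        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void delay(ThrottledResponse r) { queue.add(r); }

        @Override public void run() {
            try {
                while (!Thread.currentThread().isInterrupted())
                    queue.take().complete();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }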
> > > > > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:
> > > > > >> > >>
> > > > > >> > >> > Hi Jay,
> > > > > >> > >> >
> > > > > >> > >> > Regarding 1, I definitely like the simplicity of keeping
> > > > > >> > >> > a single throttle time field in the response. The
> > > > > >> > >> > downside is that the client metrics will be more coarse
> > > > > >> > >> > grained.
> > > > > >> > >> >
> > > > > >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage`
> > > > > >> > >> > and `log.cleaner.min.cleanable.ratio`.
> > > > > >> > >> >
> > > > > >> > >> > Ismael
> > > > > >> > >> >
> > > > > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:
> > > > > >> > >> >
> > > > > >> > >> > > A few minor comments:
> > > > > >> > >> > >
> > > > > >> > >> > >    1. Isn't it the case that the throttling time
> > > > > >> > >> > >    response field should have the total time your
> > > > > >> > >> > >    request was throttled, irrespective of the quotas
> > > > > >> > >> > >    that caused it? Limiting it to the byte rate quota
> > > > > >> > >> > >    doesn't make sense, but I also don't think we want
> > > > > >> > >> > >    to end up adding new fields in the response for
> > > > > >> > >> > >    every single thing we quota, right?
> > > > > >> > >> > >    2. I don't think we should make this quota
> > > > > >> > >> > >    specifically about io threads. Once we introduce
> > > > > >> > >> > >    these quotas people set them and expect them to be
> > > > > >> > >> > >    enforced (and if they aren't it may cause an
> > > > > >> > >> > >    outage). As a result they are a bit more sensitive
> > > > > >> > >> > >    than normal configs, I think. The current thread
> > > > > >> > >> > >    pools seem like something of an implementation
> > > > > >> > >> > >    detail and not the level the user-facing quotas
> > > > > >> > >> > >    should be involved with. I think it might be better
> > > > > >> > >> > >    to make this a general request-time throttle with no
> > > > > >> > >> > >    mention in the naming about I/O threads, and simply
> > > > > >> > >> > >    acknowledge the current limitation (which we may
> > > > > >> > >> > >    someday fix) in the docs that this covers only the
> > > > > >> > >> > >    time after the request is read off the network.
> > > > > >> > >> > >    3. As such I think the right interface to the user
> > > > > >> > >> > >    would be something like percent_request_time and be
> > > > > >> > >> > >    in {0,...,100} or request_time_ratio and be in
> > > > > >> > >> > >    {0.0,...,1.0} (I think "ratio" is the terminology we
> > > > > >> > >> > >    used if the scale is between 0 and 1 in the other
> > > > > >> > >> > >    metrics, right?)
> > > > > >> > >> > >
> > > > > >> > >> > > -Jay
> > > > > >> > >> > >
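For comparison, a quota along the lines Jay sketches would presumably be set through the same kafka-configs.sh workflow as the existing byte-rate quotas. The property name below is only a placeholder, since the naming is exactly what is being debated here:

    bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
      --add-config 'request_percentage=50' \
      --entity-type users --entity-name user1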
> > > > > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > >> > >> > >
> > > > > >> > >> > > > Guozhang/Dong,
> > > > > >> > >> > > >
> > > > > >> > >> > > > Thank you for the feedback.
> > > > > >> > >> > > >
> > > > > >> > >> > > > Guozhang: I have updated the section on the
> > > > > >> > >> > > > co-existence of byte rate and request time quotas.
> > > > > >> > >> > > >
> > > > > >> > >> > > > Dong: I hadn't added much detail to the metrics and
> > > > > >> > >> > > > sensors since they are going to be very similar to
> > > > > >> > >> > > > the existing metrics and sensors. To avoid confusion,
> > > > > >> > >> > > > I have now added more detail. All metrics are in the
> > > > > >> > >> > > > group "quotaType" and all sensors have names starting
> > > > > >> > >> > > > with "quotaType" (where quotaType is Produce/Fetch/
> > > > > >> > >> > > > LeaderReplication/FollowerReplication/*IOThread*).
> > > > > >> > >> > > > So there will be no reuse of existing
> > > > > >> > >> > > > metrics/sensors. The new ones for request processing
> > > > > >> > >> > > > time based throttling will be completely independent
> > > > > >> > >> > > > of existing metrics/sensors, but will be consistent
> > > > > >> > >> > > > in format.
> > > > > >> > >> > > >
> > > > > >> > >> > > > The existing throttle_time_ms field in produce/fetch
> > > > > >> > >> > > > responses will not be impacted by this KIP. That will
> > > > > >> > >> > > > continue to return byte-rate based throttling times.
> > > > > >> > >> > > > In addition, a new field request_throttle_time_ms
> > > > > >> > >> > > > will be added to return request quota based
> > > > > >> > >> > > > throttling times. These will be exposed as new
> > > > > >> > >> > > > metrics on the client side.
> > > > > >> > >> > > >
> > > > > >> > >> > > > Since all metrics and sensors are different for each
> > > > > >> > >> > > > type of quota, I believe there are already sufficient
> > > > > >> > >> > > > metrics to monitor throttling on both the client and
> > > > > >> > >> > > > broker side for each type of throttling.
> > > > > >> > >> > > >
> > > > > >> > >> > > > Regards,
> > > > > >> > >> > > >
> > > > > >> > >> > > > Rajini
> > > > > >> > >> > > >
> > > > > >> > >> > > >
> > > > > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:
> > > > > >> > >> > > >
> > > > > >> > >> > > > > Hey Rajini,
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > I think it makes a lot of sense to use
> > > > > >> > >> > > > > io_thread_units as the metric to quota users'
> > > > > >> > >> > > > > traffic here. LGTM overall. I have some questions
> > > > > >> > >> > > > > regarding sensors.
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > - Can you be more specific in the KIP about which
> > > > > >> > >> > > > > sensors will be added? For example, it will be
> > > > > >> > >> > > > > useful to specify the name and attributes of these
> > > > > >> > >> > > > > new sensors.
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > - We currently have throttle-time and queue-size
> > > > > >> > >> > > > > for the byte-rate based quota. Are you going to
> > > > > >> > >> > > > > have a separate throttle-time and queue-size for
> > > > > >> > >> > > > > requests throttled by the io_thread_unit-based
> > > > > >> > >> > > > > quota, or will they share the same sensor?
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > - Does the throttle-time in the ProduceResponse and
> > > > > >> > >> > > > > FetchResponse contain time due to the
> > > > > >> > >> > > > > io_thread_unit-based quota?
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > - Currently the Kafka server doesn't provide any
> > > > > >> > >> > > > > log or metric that tells whether any given clientId
> > > > > >> > >> > > > > (or user) is throttled. This is not too bad because
> > > > > >> > >> > > > > we can still check the client-side byte-rate metric
> > > > > >> > >> > > > > to validate whether a given client is throttled.
> > > > > >> > >> > > > > But with this io_thread_unit, there will be no way
> > > > > >> > >> > > > > to validate whether a given client is slow because
> > > > > >> > >> > > > > it has exceeded its io_thread_unit limit. It is
> > > > > >> > >> > > > > necessary for users to be able to know this
> > > > > >> > >> > > > > information to figure out whether they have reached
> > > > > >> > >> > > > > their quota limit. How about we add a log4j log on
> > > > > >> > >> > > > > the server side to periodically print the
> > > > > >> > >> > > > > (client_id, byte-rate-throttle-time,
> > > > > >> > >> > > > > io-thread-unit-throttle-time) so that the Kafka
> > > > > >> > >> > > > > administrator can figure out which users have
> > > > > >> > >> > > > > reached their limit and act accordingly?
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > Thanks,
> > > > > >> > >> > > > > Dong
> > > > > >> > >> > > > >
> > > > > >> > >> > > > >
> > > > > >> > >> > > > >
> > > > > >> > >> > > > >
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > > > > >> > >> > > > >
> > > > > >> > >> > > > > > Made a pass over the doc; overall LGTM except a
> > > > > >> > >> > > > > > minor comment on the throttling implementation:
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > It is stated as "Request processing time
> > > > > >> > >> > > > > > throttling will be applied on top if necessary."
> > > > > >> > >> > > > > > I thought that it meant the request processing
> > > > > >> > >> > > > > > time throttling is applied first, but continuing
> > > > > >> > >> > > > > > to read I found it actually meant to apply
> > > > > >> > >> > > > > > produce / fetch byte rate throttling first.
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > Also the last sentence, "The remaining delay if
> > > > > >> > >> > > > > > any is applied to the response.", is a bit
> > > > > >> > >> > > > > > confusing to me. Maybe reword it a bit?
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > Guozhang
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:
> > > > > >> > >> > > > > >
> > > > > >> > >> > > > > > > Hi, Rajini,
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal
> > > > > >> > >> > > > > > > looks good to me.
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > Jun
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > >> > >> > > > > > >
> > > > > >> > >> > > > > > > > Jun/Roger,
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > > Thank you for the feedback.
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > > 1. I have updated the KIP to use absolute
> > > > > >> > >> > > > > > > > units instead of a percentage. The property
> > > > > >> > >> > > > > > > > is called *io_thread_units* to align with the
> > > > > >> > >> > > > > > > > thread count property *num.io.threads*. When
> > > > > >> > >> > > > > > > > we implement network thread utilization
> > > > > >> > >> > > > > > > > quotas, we can add another property
> > > > > >> > >> > > > > > > > *network_thread_units*.
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > > 2. ControlledShutdown is already listed under
> > > > > >> > >> > > > > > > > the exempt requests. Jun, did you mean a
> > > > > >> > >> > > > > > > > different request that needs to be added? The
> > > > > >> > >> > > > > > > > four requests currently exempt in the KIP are
> > > > > >> > >> > > > > > > > StopReplica, ControlledShutdown, LeaderAndIsr
> > > > > >> > >> > > > > > > > and UpdateMetadata. These are controlled
> > > > > >> > >> > > > > > > > using the ClusterAction ACL, so it is easy to
> > > > > >> > >> > > > > > > > exclude them and only throttle if
> > > > > >> > >> > > > > > > > unauthorized. I wasn't sure if there are
> > > > > >> > >> > > > > > > > other requests used only for inter-broker
> > > > > >> > >> > > > > > > > communication that needed to be excluded.
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > > 3. I was thinking the smallest change would
> > > > > >> > >> > > > > > > > be to replace all references to
> > > > > >> > >> > > > > > > > *requestChannel.sendResponse()* with a local
> > > > > >> > >> > > > > > > > method *sendResponseMaybeThrottle()* that
> > > > > >> > >> > > > > > > > does the throttling, if any, plus sends the
> > > > > >> > >> > > > > > > > response. If we throttle first in
> > > > > >> > >> > > > > > > > *KafkaApis.handle()*, the time spent within
> > > > > >> > >> > > > > > > > the method handling the request will not be
> > > > > >> > >> > > > > > > > recorded or used in throttling. We can look
> > > > > >> > >> > > > > > > > into this again when the PR is ready for
> > > > > >> > >> > > > > > > > review.
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > > Regards,
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > > Rajini
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > >
> > > > > >> > >> > > > > > > >
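Rajini's point 3 amounts to a wrapper along these lines. Only the sendResponseMaybeThrottle and requestChannel.sendResponse names come from her description; everything else is invented for illustration (and the real KafkaApis is Scala, not Java):

    // Charges the request's handling time against the quota, then either
    // sends the response immediately or parks it until the delay expires.
    void sendResponseMaybeThrottle(Request request, Response response) {
        long throttleTimeMs = quotaManager.recordAndGetThrottleTimeMs(
                request.clientId(), request.handlingTimeNanos());
        if (throttleTimeMs > 0) {
            // Reuses the DelayQueue pattern sketched earlier in the thread.
            expirer.delay(new ThrottledResponse(throttleTimeMs,
                    () -> requestChannel.sendResponse(response)));
        } else {
            requestChannel.sendResponse(response);
        }
    }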
> > > > > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > > > > >> > >> > > > > roger.hoover@gmail.com>
> > > > > >> > >> > > > > > > > wrote:
> > > > > >> > >> > > > > > > >
Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers. While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.

3. Implementation-wise, I am wondering if it's simpler to apply the request time throttling first in KafkaApis.handle(). Otherwise, we will need to add the throttling logic to each type of request.

Thanks,

Jun

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler utilization. At the moment it uses a percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP. Also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and not have to worry about the internal distribution of the work between the two thread pools in Kafka.

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is exactly what people have said. I will just expand on that a bit. Consider the following case. The producer sends a produce request with a 10MB message compressed to 100KB with gzip. The decompression of the message on the broker could take 10-15 seconds, during which time a request handler thread is completely blocked. In this case, neither the byte-in quota nor the request rate quota may be effective in protecting the broker. Consider another case. A consumer group starts with 10 instances and later on switches to 20 instances. The request rate will likely double, but the actual load on the broker may not double since each fetch request only contains half of the partitions. A request rate quota may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much of the server-side resources. In this particular KIP, the resource is the capacity of the request handler threads. I agree that it may not be intuitive for the users to determine how to set the right limit. However, this is not completely new and has been done in the container world already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) has the concept of cpu.cfs_quota_us, which specifies the total amount of time in microseconds for which all tasks in a cgroup can run during a one second period. We can potentially model the request handler threads in a similar way. For example, each request handler thread can be 1 request handler unit and the admin can configure a limit on how many units (say 0.01) a client can have.
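
To make the arithmetic concrete (numbers purely illustrative):

    // On the cgroup model: 0.01 request handler units means the client may
    // consume 0.01 thread-seconds of request handler time per second,
    // independent of how many handler threads the broker actually runs.
    val unitsGranted = 0.01
    val windowMs = 1000L
    val allowedThreadTimeMs = unitsGranted * windowMs // 10 ms of thread time/sec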

Regarding not throttling the internal broker-to-broker requests: we could do that. Alternatively, we could just let the admin configure a high limit for the kafka user (it may not be able to do that easily based on clientId though).

Ideally we want to be able to protect the utilization of the network thread pool too. The difficulty is mostly what Rajini said: (1) The mechanism for throttling the requests is through Purgatory and we will have to think through how to integrate that into the network layer. (2) In the network layer, we currently know the user, but not the clientId of the request. So it's a bit tricky to throttle based on clientId there. Plus, the byte-out quota can already protect the network thread utilization for fetch requests. So, if we can't figure out this part right now, just focusing on the request handling threads for this KIP is still a useful feature.

Thanks,

Jun

On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that protecting the cluster is more important than protecting individual apps. I have retained the exemption for StopReplica/LeaderAndIsr etc.; these are throttled only if authorization fails (so they can't be used for DoS attacks in a secure cluster, but inter-broker requests complete without delays).

I will wait another day to see if there is any objection to quotas based on request processing time (as opposed to request rate) and if there are no objections, I will revert to the original proposal with some changes.

The original proposal included only the time used by the request handler threads (that made the calculation easy). I think the suggestion is to include the time spent in the network threads as well since that may be significant. As Jay pointed out, it is more complicated to calculate the total available CPU time and convert to a ratio when there are *m* I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give us what we want, but it can be very expensive on some platforms. As Becket and Guozhang have pointed out, we do have several time measurements already for generating metrics that we could use, though we might want to switch to nanoTime() instead of currentTimeMillis() since some of the values for small requests may be < 1ms. But rather than add up the time spent in the I/O thread and the network thread, wouldn't it be better to convert the time spent on each thread into a separate ratio? UserA has a request quota of 5%. Can we take that to mean that UserA can use 5% of the time on network threads and 5% of the time on I/O threads? If either is exceeded, the response is throttled - it would mean maintaining two sets of metrics for the two durations, but would result in more meaningful ratios. We could define two quota limits (UserA has 5% of request threads and 10% of network threads), but that seems unnecessary and harder to explain to users.
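
The single-quota, two-ratio check would roughly be (an illustrative sketch, not an implementation):

    // The same quota is checked independently against each pool, so each
    // ratio stays meaningful even when the pool sizes differ.
    def quotaViolated(ioTimeNanos: Long, networkTimeNanos: Long,
                      windowNanos: Long, numIoThreads: Int,
                      numNetworkThreads: Int, quota: Double): Boolean = {
      val ioRatio = ioTimeNanos.toDouble / (windowNanos.toDouble * numIoThreads)
      val networkRatio = networkTimeNanos.toDouble / (windowNanos.toDouble * numNetworkThreads)
      ioRatio > quota || networkRatio > quota
    }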

Back to why and how quotas are applied to network thread utilization:

a) In the case of fetch, the time spent in the network thread may be significant and I can see the need to include this. Are there other requests where the network thread utilization is significant? In the case of fetch, request handler thread utilization would throttle clients with a high request rate and low data volume, and the fetch byte rate quota will throttle clients with a high data volume. Network thread utilization is perhaps proportional to the data volume. I am wondering if we even need to throttle based on network thread utilization or whether the data volume quota covers this case.

b) At the moment, we record and check for quota violation at the same time. If a quota is violated, the response is delayed. Using Jay's example of disk reads for fetches happening in the network thread, we can't record and delay a response after the disk reads. We could record the time spent on the network thread when the response is complete and introduce a delay for handling a subsequent request (separating out recording and quota violation handling in the case of network thread overload). Does that make sense?
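
Something along these lines (helper names hypothetical):

    // Record network thread time only once the response has completed; if
    // the quota is then violated, delay the client's next request rather
    // than the response that has already gone out.
    def onResponseComplete(clientId: String, networkThreadTimeNanos: Long): Unit =
      networkTimeSensor(clientId).record(networkThreadTimeNanos)

    def beforeHandlingNextRequest(clientId: String): Unit = {
      val delayMs = currentThrottleDelayMs(clientId) // hypothetical lookup
      if (delayMs > 0) delayRequest(clientId, delayMs) // hypothetical delay
    }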

Regards,

Rajini

On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:

Hey Jay,

Yeah, I agree that enforcing the CPU time is a little tricky. I am thinking that maybe we can use the existing request statistics. They are already very detailed, so we can probably see the approximate CPU time from them, e.g. something like (total_time - request/response_queue_time - remote_time).
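
i.e. roughly:

    // Approximate on-thread time from the per-request statistics the broker
    // already records: total time minus queueing and remote wait time.
    // Parameter names only loosely mirror the existing metric names.
    def approxOnThreadTimeMs(totalTimeMs: Double, requestQueueTimeMs: Double,
                             responseQueueTimeMs: Double, remoteTimeMs: Double): Double =
      totalTimeMs - requestQueueTimeMs - responseQueueTimeMs - remoteTimeMs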

I agree with Guozhang that when a user is throttled it is likely that we need to see if anything has gone wrong first, and if the users are well behaved and just need more resources, we will have to bump up the quota for them. It is true that pre-allocating CPU time quota precisely for the users is difficult. So in practice it would probably be more like first setting a relatively high protective CPU time quota for everyone and increasing it for some individual clients on demand.

Thanks,

Jiangjie (Becket) Qin

On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

This is a great proposal, glad to see it happening.

I am inclined towards CPU throttling, or more specifically processing time ratio, instead of request rate throttling as well. Becket has summed up my rationale very well above, and one thing to add here is that the former has good support both for "protecting against rogue clients" and for "utilizing a cluster for multi-tenancy usage": when thinking about how to explain this to the end users, I find it actually more natural than the request rate since, as mentioned above, different requests will have quite different "cost", and Kafka today already has various request types (produce, fetch, admin, metadata, etc); because of that, request rate throttling may not be as effective unless it is set very conservatively.

Regarding user reactions when they are throttled, I think it may differ case-by-case, and needs to be discovered / guided by looking at relative metrics. So in other words users would not expect to get additional information by simply being told "hey, you are throttled", which is all that throttling does; they need to take a follow-up step and see "hmm, I'm throttled probably because of ..", which is done by looking at other metric values: e.g. whether I'm bombarding the brokers with metadata requests, which are usually cheap to handle but I'm sending thousands per second; or is it because I'm catching up and hence sending very heavy fetch requests with large min.bytes, etc.

Regarding the implementation, as once discussed with Jun, this seems not very difficult since today we are already collecting the "thread pool utilization" metrics, which is a single percentage "aggregateIdleMeter" value; we are already effectively aggregating it for each request in KafkaRequestHandler, and we can just extend it by recording the source client id when handling requests and aggregating by clientId as well as the total aggregate.
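
In KafkaRequestHandler that could look something like (illustrative only):

    // Alongside the existing pool-wide idle-ratio metric, attribute each
    // request's on-thread time to the client that sent it.
    val startNs = time.nanoseconds()
    apis.handle(request)
    val onThreadNs = time.nanoseconds() - startNs
    clientRequestTimeSensor(request.header.clientId).record(onThreadNs) // hypothetical sensor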

Guozhang

On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <jay@confluent.io> wrote:

Hey Becket/Rajini,

When I thought about it more deeply I came around to the "percent of processing time" metric too. It seems a lot closer to the thing we actually care about and need to protect. I also think this would be a very useful metric even in the absence of throttling just to debug who's using capacity.

Two problems to consider:

   1. I agree that for the user it is understandable what led to their being throttled, but it is a bit hard to figure out the safe range for them. i.e. if I have a new app that will send 200 messages/sec I can probably reason that I'll be under the throttling limit of 300 req/sec. However if I need to be under a 10% CPU resources limit it may be a bit harder for me to know a priori if I will or won't.
   2. Calculating the available CPU time is a bit difficult since there are actually two thread pools--the I/O threads and the network threads. I think it might be workable to count just the I/O thread time as in the proposal, but the network thread work is actually non-trivial (e.g. all the disk reads for fetches happen in that thread). If you count both the network and I/O threads it can skew things a bit. E.g. say you have 50 network threads, 10 I/O threads, and 8 cores, what is the available cpu time in a second? I suppose this is a problem whenever you have a bottleneck between I/O and network threads or if you end up significantly over-provisioning one pool (both of which are hard to avoid). See the arithmetic sketch below.

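With those numbers (purely illustrative arithmetic):

    // Counting wall-clock thread time overstates capacity once threads
    // outnumber cores: nominally 60 thread-seconds are available per second,
    // but only 8 core-seconds of real CPU exist.
    val nominalThreadSecsPerSec = (50 + 10) * 1.0 // 50 network + 10 I/O threads
    val realCoreSecsPerSec = 8 * 1.0              // 8 cores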

An alternative for CPU throttling would be to use this api:
http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)

That would let you track actual CPU usage across the network, I/O, and purgatory threads and look at it as a percentage of total cores. I think this fixes many problems in the reliability of the metric. Its meaning is slightly different as it is just CPU (you don't get charged for time blocking on I/O) but that may be okay because we already have a throttle on I/O. The downside is I think it is possible this api can be disabled or isn't always available and it may also be expensive (also I've never used it so not sure if it really works the way I think).
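
For reference, usage would be roughly:

    import java.lang.management.ManagementFactory

    // Per-thread CPU time via JMX; guarded because support can be missing or
    // disabled, and getThreadCpuTime returns -1 when it cannot measure.
    val bean = ManagementFactory.getThreadMXBean
    if (bean.isThreadCpuTimeSupported && bean.isThreadCpuTimeEnabled) {
      val cpuNanos = bean.getThreadCpuTime(Thread.currentThread().getId)
      if (cpuNanos != -1) {
        // accumulate cpuNanos against the client's quota window here
      }
    }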

-Jay

On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com> wrote:

If the purpose of the KIP is only to protect the cluster from being overwhelmed by crazy clients and is not intended to address the resource allocation problem among the clients, I am wondering if using a request handling time quota (CPU time quota) is a better option. Here are the reasons:

1. A request handling time quota has better protection. Say we have a request rate quota and set that to some value like 100 requests/sec; it is possible that some of the requests are very expensive and actually take a lot of time to handle. In that case a few clients may still occupy a lot of CPU time even though the request rate is low. Arguably we can carefully set the request rate quota for each request and client id combination, but it could still be tricky to get it right for everyone.

If we use the request handling time quota, we can simply say that no client can take up more than 30% of the total request handling capacity (measured by time), regardless of the differences among requests or what the client is doing. In this case maybe we can quota all the requests if we want to.

2. The main benefit of using a request rate limit is that it seems more intuitive. It is true that it is probably easier to explain to the user what it means. However, in practice it looks like the impact of a request rate quota is no more quantifiable than that of a request handling time quota. Unlike the byte rate quota, it is still difficult to give a number about the impact on throughput or latency when a request rate quota is hit. So it is not better than the request handling time quota. In fact I feel it is clearer to tell the user "you are limited because you have taken 30% of the CPU time on the broker" than something like "your request rate quota on metadata requests has been reached".
> > > > > >> > >> > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > Thanks,
> > > > > >> > >> > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > >> > >> > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23
> > PM,
> > > > > Jay
> > > > > >> > >> Kreps <
> > > > > >> > >> > > > > > > > > jay@confluent.io
> > > > > >> > >> > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > wrote:
> > > > > >> > >> > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > I think this proposal
> > makes a
> > > > lot
> > > > > >> of
> > > > > >> > >> sense
> > > > > >> > >> > > > > > > (especially
> > > > > >> > >> > > > > > > > > now
> > > > > >> > >> > > > > > > > > > > that
> > > > > >> > >> > > > > > > > > > > > > it
> > > > > >> > >> > > > > > > > > > > > > > is
> > > > > >> > >> > > > > > > > > > > > > > > > > > oriented around request
> > rate)
> > > > and
> > > > > >> > fills
> > > > > >> > >> the
> > > > > >> > >> > > > > biggest
> > > > > >> > >> > > > > > > > > > remaining
> > > > > >> > >> > > > > > > > > > > > gap
> > > > > >> > >> > > > > > > > > > > > > > in
> > > > > >> > >> > > > > > > > > > > > > > > > the
> > > > > >> > >> > > > > > > > > > > > > > > > > > multi-tenancy story.
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > I think for intra-cluster
> > > > > >> > communication
> > > > > >> > >> > > > > > (StopReplica,
> > > > > >> > >> > > > > > > > > etc)
> > > > > >> > >> > > > > > > > > > we
> > > > > >> > >> > > > > > > > > > > > > could
> > > > > >> > >> > > > > > > > > > > > > > > > avoid
> > > > > >> > >> > > > > > > > > > > > > > > > > > throttling entirely. You
> > can
> > > > > >> secure or
> > > > > >> > >> > > > otherwise
> > > > > >> > >> > > > > > > > > lock-down
> > > > > >> > >> > > > > > > > > > > the
> > > > > >> > >> > > > > > > > > > > > > > > cluster
> > > > > >> > >> > > > > > > > > > > > > > > > > > communication to avoid any
> > > > > >> > unauthorized
> > > > > >> > >> > > > external
> > > > > >> > >> > > > > > > party
> > > > > >> > >> > > > > > > > > from
> > > > > >> > >> > > > > > > > > > > > > trying
> > > > > >> > >> > > > > > > > > > > > > > to
> > > > > >> > >> > > > > > > > > > > > > > > > > > initiate these requests.
> > As a
> > > > > >> result
> > > > > >> > we
> > > > > >> > >> are
> > > > > >> > >> > > as
> > > > > >> > >> > > > > > likely
> > > > > >> > >> > > > > > > > to
> > > > > >> > >> > > > > > > > > > > cause
> > > > > >> > >> > > > > > > > > > > > > > > problems
> > > > > >> > >> > > > > > > > > > > > > > > > > as
> > > > > >> > >> > > > > > > > > > > > > > > > > > solve them by throttling
> > these,
> > > > > >> right?
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > I'm not so sure that we
> > should
> > > > > >> exempt
> > > > > >> > >> the
> > > > > >> > >> > > > > consumer
> > > > > >> > >> > > > > > > > > requests
> > > > > >> > >> > > > > > > > > > > > such
> > > > > >> > >> > > > > > > > > > > > > as
> > > > > >> > >> > > > > > > > > > > > > > > > > > heartbeat. It's true that
> > if we
> > > > > >> > >> throttle an
> > > > > >> > >> > > > app's
> > > > > >> > >> > > > > > > > > heartbeat
> > > > > >> > >> > > > > > > > > > > > > > requests
> > > > > >> > >> > > > > > > > > > > > > > > it
> > > > > >> > >> > > > > > > > > > > > > > > > > may
> > > > > >> > >> > > > > > > > > > > > > > > > > > cause it to fall out of its
> > > > > >> consumer
> > > > > >> > >> group.
> > > > > >> > >> > > > > However
> > > > > >> > >> > > > > > > if
> > > > > >> > >> > > > > > > > we
> > > > > >> > >> > > > > > > > > > > don't
> > > > > >> > >> > > > > > > > > > > > > > > > throttle
> > > > > >> > >> > > > > > > > > > > > > > > > > it
> > > > > >> > >> > > > > > > > > > > > > > > > > > it may DDOS the cluster if
> > the
> > > > > >> > heartbeat
> > > > > >> > >> > > > interval
> > > > > >> > >> > > > > > is
> > > > > >> > >> > > > > > > > set
> > > > > >> > >> > > > > > > > > > > > > > incorrectly
> > > > > >> > >> > > > > > > > > > > > > > > or
> > > > > >> > >> > > > > > > > > > > > > > > > > if
> > > > > >> > >> > > > > > > > > > > > > > > > > > some client in some
> > language
> > > > has
> > > > > a
> > > > > >> > bug.
> > > > > >> > >> I
> > > > > >> > >> > > think
> > > > > >> > >> > > > > the
> > > > > >> > >> > > > > > > > > policy
> > > > > >> > >> > > > > > > > > > > with
> > > > > >> > >> > > > > > > > > > > > > > this
> > > > > >> > >> > > > > > > > > > > > > > > > kind
> > > > > >> > >> > > > > > > > > > > > > > > > > > of throttling is to
> > protect the
> > > > > >> > cluster
> > > > > >> > >> > above
> > > > > >> > >> > > > any
> > > > > >> > >> > > > > > > > > > individual
> > > > > >> > >> > > > > > > > > > > > app,
> > > > > >> > >> > > > > > > > > > > > > > > > right?
> > > > > >> > >> > > > > > > > > > > > > > > > > I
> > > > > >> > >> > > > > > > > > > > > > > > > > > think in general this
> > should be
> > > > > >> okay
> > > > > >> > >> since
> > > > > >> > >> > > for
> > > > > >> > >> > > > > most
> > > > > >> > >> > > > > > > > > > > deployments
> > > > > >> > >> > > > > > > > > > > > > > this
> > > > > >> > >> > > > > > > > > > > > > > > > > > setting is meant as more
> > of a
> > > > > >> safety
> > > > > >> > >> > > > valve---that
> > > > > >> > >> > > > > > is
> > > > > >> > >> > > > > > > > > rather
> > > > > >> > >> > > > > > > > > > > > than
> > > > > >> > >> > > > > > > > > > > > > > set
> > > > > >> > >> > > > > > > > > > > > > > > > > > something very close to
> > what
> > > > you
> > > > > >> > expect
> > > > > >> > >> to
> > > > > >> > >> > > need
> > > > > >> > >> > > > > > (say
> > > > > >> > >> > > > > > > 2
> > > > > >> > >> > > > > > > > > > > req/sec
> > > > > >> > >> > > > > > > > > > > > or
> > > > > >> > >> > > > > > > > > > > > > > > > > whatever)
> > > > > >> > >> > > > > > > > > > > > > > > > > > you would have something
> > quite
> > > > > high
> > > > > >> > >> (like
> > > > > >> > >> > 100
> > > > > >> > >> > > > > > > req/sec)
> > > > > >> > >> > > > > > > > > with
> > > > > >> > >> > > > > > > > > > > > this
> > > > > >> > >> > > > > > > > > > > > > > > meant
> > > > > >> > >> > > > > > > > > > > > > > > > to
> > > > > >> > >> > > > > > > > > > > > > > > > > > prevent a client gone
> > crazy. I
> > > > > >> think
> > > > > >> > >> when
> > > > > >> > >> > > used
> > > > > >> > >> > > > > this
> > > > > >> > >> > > > > > > way
> > > > > >> > >> > > > > > > > > > > > allowing
> > > > > >> > >> > > > > > > > > > > > > > > those
> > > > > >> > >> > > > > > > > > > > > > > > > to
> > > > > >> > >> > > > > > > > > > > > > > > > > > be throttled would actually
> > > > > provide
> > > > > >> > >> > > meaningful
> > > > > >> > >> > > > > > > > > protection.
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > -Jay
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > On Fri, Feb 17, 2017 at
> > 9:05
> > > > AM,
> > > > > >> > Rajini
> > > > > >> > >> > > > Sivaram <
> > > > > >> > >> > > > > > > > > > > > > > > > rajinisivaram@gmail.com
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > wrote:
> > > > > >> > >> > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > > Hi all,
> > > > > >> > >> > > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > > I have just created
> > KIP-124
> > > > to
> > > > > >> > >> introduce
> > > > > >> > >> > > > > request
> > > > > >> > >> > > > > > > rate
> > > > > >> > >> > > > > > > > > > > quotas
> > > > > >> > >> > > > > > > > > > > > to
> > > > > >> > >> > > > > > > > > > > > > > > > Kafka:
> > > > > >> > >> > > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > >
> > https://cwiki.apache.org/
> > > > > >> > >> > > > > > > > confluence/display/KAFKA/KIP-
> > > > > >> > >> > > > > > > > > > > > > > > > > > > 124+-+Request+rate+quotas
> > > > > >> > >> > > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > > The proposal is for a
> > simple
> > > > > >> > >> percentage
> > > > > >> > >> > > > request
> > > > > >> > >> > > > > > > > > handling
> > > > > >> > >> > > > > > > > > > > time
> > > > > >> > >> > > > > > > > > > > > > > quota
> > > > > >> > >> > > > > > > > > > > > > > > > > that
> > > > > >> > >> > > > > > > > > > > > > > > > > > > can be allocated to
> > > > > >> *<client-id>*,
> > > > > >> > >> > *<user>*
> > > > > >> > >> > > > or
> > > > > >> > >> > > > > > > > *<user,
> > > > > >> > >> > > > > > > > > > > > > > client-id>*.
> > > > > >> > >> > > > > > > > > > > > > > > > > There
> > > > > >> > >> > > > > > > > > > > > > > > > > > > are a few other
> > suggestions
> > > > > also
> > > > > >> > under
> > > > > >> > >> > > > > "Rejected
> > > > > >> > >> > > > > > > > > > > > alternatives".
> > > > > >> > >> > > > > > > > > > > > > > > > > Feedback
> > > > > >> > >> > > > > > > > > > > > > > > > > > > and suggestions are
> > welcome.
> > > > > >> > >> > > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > > Thank you...
> > > > > >> > >> > > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > > Regards,
> > > > > >> > >> > > > > > > > > > > > > > > > > > >
> > > > > >> > >> > > > > > > > > > > > > > > > > > > Rajini
> > > > > >> > >> > > > > > > > > > > > > > > > > > >

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Colin,

Thank you for the feedback. Since we are reusing the existing
throttle_time_ms field for produce/fetch responses, changing this to
microseconds would be a breaking change. Since we don't currently plan to
throttle at sub-millisecond intervals, perhaps it makes sense to keep the
value consistent with the existing responses (and metrics which expose this
value) and change them all together in future if required?
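
As a quick illustration of why milliseconds are comfortable here (not part
of the KIP, just arithmetic): a signed 32-bit throttle_time_ms field can
express about 24.8 days of delay, and even reinterpreted as microseconds it
could still express about 35 minutes, which is why Colin notes below that an
int32 would remain sufficient. A minimal Scala sketch:

    // Illustrative only: range of a signed 32-bit throttle time field.
    object ThrottleFieldRange extends App {
      val maxValue = Int.MaxValue // 2147483647
      println(f"as milliseconds: ${maxValue / 86400000.0}%.1f days")
      println(f"as microseconds: ${maxValue / 60000000.0}%.1f minutes")
    }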

Regards,

Rajini

On Tue, Feb 28, 2017 at 5:58 PM, Colin McCabe <cm...@apache.org> wrote:

> I noticed that the throttle_time_ms added to all the message responses
> is in milliseconds.  Does it make sense to express this in microseconds
> in case we start doing more fine-grained CPU throttling later on?  An
> int32 should still be more than enough if using microseconds.
>
> best,
> Colin
>
>
> On Fri, Feb 24, 2017, at 10:31, Jun Rao wrote:
> > Hi, Jay,
> >
> > 2. Regarding request.unit vs request.percentage. I started with
> > request.percentage too. The reasoning for request.unit is the following.
> > Suppose that the capacity has been reached on a broker and the admin
> > needs to add a new user. A simple way to increase the capacity is to
> > increase the number of io threads, assuming there are still enough cores.
> > If the limit is based on percentage, the additional capacity
> > automatically gets distributed to existing users and we haven't really
> > carved out any additional resource for the new user. Now, is it easy for
> > a user to reason about 0.1 unit vs 10%? My feeling is that both are hard
> > and have to be configured empirically. Not sure if percentage is
> > obviously easier to reason about.
> >
> > Thanks,
> >
> > Jun
> >
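
To make Jun's point concrete, a small illustration (numbers invented) of how
the two schemes behave when the admin doubles num.io.threads:

    // Illustrative arithmetic only: percentage quotas scale with the pool,
    // absolute "thread unit" quotas do not.
    object QuotaUnitsVsPercent extends App {
      def percentCapacity(percent: Double, numThreads: Int): Double =
        percent / 100.0 * numThreads // grows when threads are added
      def unitCapacity(units: Double): Double =
        units                        // fixed, regardless of pool size
      for (threads <- Seq(8, 16))
        println(s"pool=$threads: 10% => ${percentCapacity(10, threads)} threads, " +
                s"0.8 units => ${unitCapacity(0.8)} threads")
    }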
> > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:
> >
> > > A couple of quick points:
> > >
> > > 1. Even though the implementation of this quota is only using io thread
> > > time, i think we should call it something like "request-time". This
> will
> > > give us flexibility to improve the implementation to cover network
> threads
> > > in the future and will avoid exposing internal details like our thread
> > > pools on the server.
> > >
> > > 2. Jun/Roger, I get what you are trying to fix but the idea of
> thread/units
> > > is super unintuitive as a user-facing knob. I had to read the KIP like
> > > eight times to understand this. I'm not sure that your point that
> > > increasing the number of threads is a problem with a percentage-based
> > > value, it really depends on whether the user thinks about the
> "percentage
> > > of request processing time" or "thread units". If they think "I have
> > > allocated 10% of my request processing time to user x" then it is a bug
> > > that increasing the thread count decreases that percent as it does in
> the
> > > current proposal. As a practical matter I think the only way to
> actually
> > > reason about this is as a percent---I just don't believe people are
> going
> > > to think, "ah, 4.3 thread units, that is the right amount!". Instead I
> > > think they have to understand this thread unit concept, figure out what
> > > they have set in number of threads, compute a percent and then come up
> with
> > > the number of thread units, and these will all be wrong if that thread
> > > count changes. I also think this ties us to throttling the I/O thread
> pool,
> > > which may not be where we want to end up.
> > >
> > > 3. For what it's worth I do think having a single throttle_ms field in
> all
> > > the responses that combines all throttling from all quotas is probably
> the
> > > simplest. There could be a use case for having separate fields for
> each,
> > > but I think that is actually harder to use/monitor in the common case
> so
> > > unless someone has a use case I think just one should be fine.
> > >
> > > -Jay
> > >
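
One way to read Jay's point 3 in code, purely as a sketch (the combination
policy below is invented, not something the KIP specifies at this point):

    // Sketch only: fold the delays imposed by different quotas into the one
    // throttle_ms response field. Max vs sum is a design choice; max is
    // shown since the delays would overlap rather than stack.
    case class QuotaDelays(byteRateMs: Long, requestTimeMs: Long) {
      def combinedMs: Long = math.max(byteRateMs, requestTimeMs)
    }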
> > > On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > wrote:
> > >
> > > > I have updated the KIP based on the discussions so far.
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram
> > > > <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > Thank you all for the feedback.
> > > > >
> > > > > Ismael #1. It makes sense not to throttle inter-broker requests like
> > > > > LeaderAndIsr etc. The simplest way to ensure that clients cannot use
> > > > > these requests to bypass quotas for DoS attacks is to ensure that
> > > > > ACLs prevent clients from using these requests and unauthorized
> > > > > requests are included towards quotas.
> > > > >
> > > > > Ismael #2, Jay #1 : I was thinking that these quotas can return a
> > > > > separate throttle time, and all utilization based quotas could use
> > > > > the same field (we won't add another one for network thread
> > > > > utilization for instance). But perhaps it makes sense to keep byte
> > > > > rate quotas separate in produce/fetch responses to provide separate
> > > > > metrics? Agree with Ismael that the name of the existing field
> > > > > should be changed if we have two. Happy to switch to a single
> > > > > combined throttle time if that is sufficient.
> > > > >
> > > > > Ismael #4, #5, #6: Will update KIP. Will use dot separated name for
> > > > > new property. Replication quotas use dot separated, so it will be
> > > > > consistent with all properties except byte rate quotas.
> > > > >
> > > > > Radai: #1 Request processing time rather than request rate was
> > > > > chosen because the time per request can vary significantly between
> > > > > requests as mentioned in the discussion and KIP.
> > > > > #2 Two separate quotas for heartbeats/regular requests feel like
> > > > > more configuration and more metrics. Since most users would set
> > > > > quotas higher than the expected usage and quotas are more of a
> > > > > safety net, a single quota should work in most cases.
> > > > > #3 The number of requests in purgatory is limited by the number of
> > > > > active connections since only one request per connection will be
> > > > > throttled at a time.
> > > > > #4 As with byte rate quotas, to use the full allocated quotas,
> > > > > clients/users would need to use partitions that are distributed
> > > > > across the cluster. The alternative of using cluster-wide quotas
> > > > > instead of per-broker quotas would be far too complex to implement.
> > > > >
> > > > > Dong : We currently have two ClientQuotaManagers for quota types
> > > > > Fetch and Produce. A new one will be added for IOThread, which
> > > > > manages quotas for I/O thread utilization. This will not update the
> > > > > Fetch or Produce queue-size, but will have a separate metric for the
> > > > > queue-size. I wasn't planning to add any additional metrics apart
> > > > > from the equivalent ones for existing quotas as part of this KIP.
> > > > > Ratio of byte-rate to I/O thread utilization could be slightly
> > > > > misleading since it depends on the sequence of requests. But we can
> > > > > look into more metrics after the KIP is implemented if required.
> > > > >
> > > > > I think we need to limit the maximum delay since all requests are
> > > > > throttled. If a client has a quota of 0.001 units and a single
> > > > > request used 50ms, we don't want to delay all requests from the
> > > > > client by 50 seconds, throwing the client out of all its consumer
> > > > > groups. The issue is only if a user is allocated a quota that is
> > > > > insufficient to process one large request. The expectation is that
> > > > > the units allocated per user will be much higher than the time taken
> > > > > to process one request and the limit should seldom be applied. Agree
> > > > > this needs proper documentation.
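
The arithmetic behind the 50ms example above, as a sketch (the one-second
window length is assumed here for illustration):

    // A client with 0.001 units that spends 50ms of handler time would,
    // uncapped, owe roughly 50 seconds of delay; capping at the window
    // size keeps it from being thrown out of its consumer groups.
    object MaxDelayExample extends App {
      val quotaUnits = 0.001   // fraction of one request handler thread
      val usedMs     = 50.0    // handler time consumed by one request
      val windowMs   = 1000.0  // assumed quota window
      val naiveDelayMs  = usedMs / quotaUnits - usedMs       // ~49950ms
      val cappedDelayMs = math.min(naiveDelayMs, windowMs)   // 1000ms
      println(s"naive=$naiveDelayMs ms, capped=$cappedDelayMs ms")
    }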
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > @jun: i wasnt concerned about tying up a request processing
> > > > > > thread, but IIUC the code does still read the entire request out,
> > > > > > which might add up to a non-negligible amount of memory.
> > > > > >
> > > > > > On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Hey Rajini,
> > > > > > >
> > > > > > > The current KIP says that the maximum delay will be reduced to
> > > > > > > window size if it is larger than the window size. I have a
> > > > > > > concern with this:
> > > > > > >
> > > > > > > 1) This essentially means that the user is allowed to exceed
> > > > > > > their quota over a long period of time. Can you provide an upper
> > > > > > > bound on this deviation?
> > > > > > >
> > > > > > > 2) What is the motivation for capping the maximum delay by the
> > > > > > > window size? I am wondering if there is a better alternative to
> > > > > > > address the problem.
> > > > > > >
> > > > > > > 3) It means that the existing metric-related config will have a
> > > > > > > more direct impact on the mechanism of this io-thread-unit-based
> > > > > > > quota. This may be an important change depending on the answer
> > > > > > > to 1) above. We probably need to document this more explicitly.
> > > > > > >
> > > > > > > Dong
> > > > > > > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hey Jun,
> > > > > > > >
> > > > > > > > Yeah you are right. I thought it wasn't because at LinkedIn it
> > > > > > > > will be too much pressure on inGraph to expose those
> > > > > > > > per-clientId metrics so we ended up printing them periodically
> > > > > > > > to local log. Never mind if it is not a general problem.
> > > > > > > >
> > > > > > > > Hey Rajini,
> > > > > > > >
> > > > > > > > - I agree with Jay that we probably don't want to add a new
> > > > > > > > field for every quota in ProduceResponse or FetchResponse. Is
> > > > > > > > there any use-case for having separate throttle-time fields
> > > > > > > > for byte-rate-quota and io-thread-unit-quota? You probably
> > > > > > > > need to document this as an interface change if you plan to
> > > > > > > > add a new field in any request.
> > > > > > > >
> > > > > > > > - I don't think IOThread belongs to quotaType. The existing
> > > > > > > > quota types (i.e. Produce/Fetch/LeaderReplication/
> > > > > > > > FollowerReplication) identify the type of request that is
> > > > > > > > throttled, not the quota mechanism that is applied.
> > > > > > > >
> > > > > > > > - If a request is throttled due to this io-thread-unit-based
> > > > > > > > quota, is the existing queue-size metric in ClientQuotaManager
> > > > > > > > incremented?
> > > > > > > >
> > > > > > > > - In the interest of providing a guideline for admins to
> > > > > > > > decide the io-thread-unit-based quota and for users to
> > > > > > > > understand its impact on their traffic, would it be useful to
> > > > > > > > have a metric that shows the overall byte-rate per
> > > > > > > > io-thread-unit? Can we also show this as a per-clientId
> > > > > > > > metric?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > > Dong
> > > > > > > >
> > > > > > > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Ismael,
> > > > > > > > >
> > > > > > > > > For #3, typically, an admin won't configure more io threads
> > > > > > > > > than CPU cores, but it's possible for an admin to start with
> > > > > > > > > fewer io threads than cores and grow that later on.
> > > > > > > > >
> > > > > > > > > Hi, Dong,
> > > > > > > > >
> > > > > > > > > I think the throttleTime sensor on the broker tells the
> > > > > > > > > admin whether a user/clientId is throttled or not.
> > > > > > > > >
> > > > > > > > > Hi, Radai,
> > > > > > > > >
> > > > > > > > > The reasoning for delaying the throttled requests on the
> > > > > > > > > broker instead of returning an error immediately is that the
> > > > > > > > > latter has no way to prevent the client from retrying
> > > > > > > > > immediately, which will make things worse. The delaying
> > > > > > > > > logic is based off a delay queue. A separate expiration
> > > > > > > > > thread just waits on the next to be expired request. So, it
> > > > > > > > > doesn't tie up a request handler thread.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
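
A minimal sketch of the delay-queue pattern Jun describes above (class and
member names are invented for illustration; this is not the broker's actual
code):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // A throttled response that becomes "due" after delayMs.
    class ThrottledResponse(val send: () => Unit, delayMs: Long) extends Delayed {
      private val dueAtMs = System.currentTimeMillis + delayMs
      def getDelay(unit: TimeUnit): Long =
        unit.convert(dueAtMs - System.currentTimeMillis, TimeUnit.MILLISECONDS)
      def compareTo(o: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.MILLISECONDS),
                               o.getDelay(TimeUnit.MILLISECONDS))
    }

    object DelayQueueSketch extends App {
      val queue = new DelayQueue[ThrottledResponse]()
      // One expiration thread blocks on the next due response, so no
      // request handler thread is tied up while a client is throttled.
      val expirer = new Thread(new Runnable {
        def run(): Unit = while (true) queue.take().send()
      })
      expirer.setDaemon(true)
      expirer.start()
      queue.put(new ThrottledResponse(() => println("sent after delay"), 100))
      Thread.sleep(200)
    }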
> > > > > > > > > On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma
> > > > > > > > > <ismael@juma.me.uk> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Jay,
> > > > > > > > > >
> > > > > > > > > > Regarding 1, I definitely like the simplicity of keeping a
> > > > > > > > > > single throttle time field in the response. The downside
> > > > > > > > > > is that the client metrics will be more coarse grained.
> > > > > > > > > >
> > > > > > > > > > Regarding 3, we have
> > > > > > > > > > `leader.imbalance.per.broker.percentage` and
> > > > > > > > > > `log.cleaner.min.cleanable.ratio`.
> > > > > > > > > >
> > > > > > > > > > Ismael
> > > > > > > > > >
> > > > > > > > > > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps
> > > > > > > > > > <jay@confluent.io> wrote:
> > > > > > > > > >
> > > > > > > > > > > A few minor comments:
> > > > > > > > > > >
> > > > > > > > > > >    1. Isn't it the case that the throttling time response
> > > > > > > > > > >    field should have the total time your request was
> > > > > > > > > > >    throttled irrespective of the quotas that caused that?
> > > > > > > > > > >    Limiting it to the byte rate quota doesn't make sense,
> > > > > > > > > > >    but I also don't think we want to end up adding new
> > > > > > > > > > >    fields in the response for every single thing we
> > > > > > > > > > >    quota, right?
> > > > > > > > > > >    2. I don't think we should make this quota
> > > > > > > > > > >    specifically about io threads. Once we introduce these
> > > > > > > > > > >    quotas people set them and expect them to be enforced
> > > > > > > > > > >    (and if they aren't it may cause an outage). As a
> > > > > > > > > > >    result they are a bit more sensitive than normal
> > > > > > > > > > >    configs, I think. The current thread pools seem like
> > > > > > > > > > >    something of an implementation detail and not the
> > > > > > > > > > >    level the user-facing quotas should be involved with.
> > > > > > > > > > >    I think it might be better to make this a general
> > > > > > > > > > >    request-time throttle with no mention in the naming
> > > > > > > > > > >    about I/O threads and simply acknowledge the current
> > > > > > > > > > >    limitation (which we may someday fix) in the docs that
> > > > > > > > > > >    this covers only the time after the request is read
> > > > > > > > > > >    off the network.
> > > > > > > > > > >    3. As such I think the right interface to the user
> > > > > > > > > > >    would be something like percent_request_time and be in
> > > > > > > > > > >    {0,...,100} or request_time_ratio and be in
> > > > > > > > > > >    {0.0,...,1.0} (I think "ratio" is the terminology we
> > > > > > > > > > >    used if the scale is between 0 and 1 in the other
> > > > > > > > > > >    metrics, right?)
> > > > > > > > > > >
> > > > > > > > > > > -Jay
> > > > > > > > > > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram
> > > > > > > > > > > <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Guozhang/Dong,
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you for the feedback.
> > > > > > > > > > > >
> > > > > > > > > > > > Guozhang : I have updated the section on co-existence
> > > > > > > > > > > > of byte rate and request time quotas.
> > > > > > > > > > > >
> > > > > > > > > > > > Dong: I hadn't added much detail to the metrics and
> > > > > > > > > > > > sensors since they are going to be very similar to the
> > > > > > > > > > > > existing metrics and sensors. To avoid confusion, I
> > > > > > > > > > > > have now added more detail. All metrics are in the
> > > > > > > > > > > > group "quotaType" and all sensors have names starting
> > > > > > > > > > > > with "quotaType" (where quotaType is Produce/Fetch/
> > > > > > > > > > > > LeaderReplication/FollowerReplication/*IOThread*). So
> > > > > > > > > > > > there will be no reuse of existing metrics/sensors.
> > > > > > > > > > > > The new ones for request processing time based
> > > > > > > > > > > > throttling will be completely independent of existing
> > > > > > > > > > > > metrics/sensors, but will be consistent in format.
> > > > > > > > > > > >
> > > > > > > > > > > > The existing throttle_time_ms field in produce/fetch
> > > > > > > > > > > > responses will not be impacted by this KIP. That will
> > > > > > > > > > > > continue to return byte-rate based throttling times. In
> > > > > > > > > > > > addition, a new field request_throttle_time_ms will be
> > > > > > > > > > > > added to return request quota based throttling times.
> > > > > > > > > > > > These will be exposed as new metrics on the
> > > > > > > > > > > > client-side.
> > > > > > > > > > > >
> > > > > > > > > > > > Since all metrics and sensors are different for each
> > > > > > > > > > > > type of quota, I believe there are already sufficient
> > > > > > > > > > > > metrics to monitor throttling on both client and broker
> > > > > > > > > > > > side for each type of throttling.
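
For readers unfamiliar with the sensor machinery being reused here, a sketch
(sensor and metric names invented) using the same
org.apache.kafka.common.metrics classes that back the existing byte-rate
quotas:

    import java.util.concurrent.TimeUnit
    import org.apache.kafka.common.metrics.{MetricConfig, Metrics, Quota, QuotaViolationException}
    import org.apache.kafka.common.metrics.stats.Rate

    object RequestTimeSensorSketch extends App {
      val metrics = new Metrics()
      // Hypothetical bound: at most 25% of one I/O thread's time.
      val config = new MetricConfig().quota(Quota.upperBound(0.25))
      val sensor = metrics.sensor("IOThread-client-A", config)
      sensor.add(metrics.metricName("io-thread-time-rate", "IOThread",
        "fraction of I/O thread time used"), new Rate(TimeUnit.SECONDS))

      val threadTimeNanos = 50L * 1000 * 1000  // pretend one request used 50ms
      try sensor.record(threadTimeNanos / 1e9) // seconds of thread time
      catch {
        case _: QuotaViolationException =>
          // the quota manager would compute a throttle time here
      }
    }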
> > > > > > > > > > > > Regards,
> > > > > > > > > > > >
> > > > > > > > > > > > Rajini
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin
> > > > > > > > > > > > <lindong28@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hey Rajini,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think it makes a lot of sense to use
> > > > > > > > > > > > > io_thread_units as the metric to quota a user's
> > > > > > > > > > > > > traffic here. LGTM overall. I have some questions
> > > > > > > > > > > > > regarding sensors.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Can you be more specific in the KIP about what
> > > > > > > > > > > > > sensors will be added? For example, it will be
> > > > > > > > > > > > > useful to specify the name and attributes of these
> > > > > > > > > > > > > new sensors.
> > > > > > > > > > > > >
> > > > > > > > > > > > > - We currently have throttle-time and queue-size for
> > > > > > > > > > > > > the byte-rate based quota. Are you going to have
> > > > > > > > > > > > > separate throttle-time and queue-size for requests
> > > > > > > > > > > > > throttled by the io_thread_unit-based quota, or will
> > > > > > > > > > > > > they share the same sensor?
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Does the throttle-time in the ProduceResponse and
> > > > > > > > > > > > > FetchResponse contain time due to the
> > > > > > > > > > > > > io_thread_unit-based quota?
> > > > > > > > > > > > >
> > > > > > > > > > > > > - Currently the kafka server doesn't provide any log
> > > > > > > > > > > > > or metrics that tell whether any given clientId (or
> > > > > > > > > > > > > user) is throttled. This is not too bad because we
> > > > > > > > > > > > > can still check the client-side byte-rate metric to
> > > > > > > > > > > > > validate whether a given client is throttled. But
> > > > > > > > > > > > > with this io_thread_unit, there will be no way to
> > > > > > > > > > > > > validate whether a given client is slow because it
> > > > > > > > > > > > > has exceeded its io_thread_unit limit. It is
> > > > > > > > > > > > > necessary for the user to be able to know this
> > > > > > > > > > > > > information to figure out whether they have reached
> > > > > > > > > > > > > their quota limit. How about we add a log4j log on
> > > > > > > > > > > > > the server side to periodically print the (client_id,
> > > > > > > > > > > > > byte-rate-throttle-time,
> > > > > > > > > > > > > io-thread-unit-throttle-time) so that the kafka
> > > > > > > > > > > > > administrator can figure out those users that have
> > > > > > > > > > > > > reached their limit and act accordingly?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > Dong
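
A sketch of the periodic server-side log line Dong suggests (all names and
the data source below are invented stand-ins for the real per-client
throttle-time sensors):

    import java.util.concurrent.{Executors, TimeUnit}

    object ThrottleLogSketch extends App {
      case class ClientThrottle(clientId: String, byteRateMs: Long, ioThreadMs: Long)
      // Stand-in for reading the real per-client throttle-time sensors.
      def snapshot(): Seq[ClientThrottle] = Seq(ClientThrottle("clientA", 120, 45))

      val scheduler = Executors.newSingleThreadScheduledExecutor()
      scheduler.scheduleAtFixedRate(new Runnable {
        def run(): Unit = snapshot().foreach { t =>
          println(s"(${t.clientId}, byte-rate-throttle-time=${t.byteRateMs}ms, " +
                  s"io-thread-unit-throttle-time=${t.ioThreadMs}ms)")
        }
      }, 1, 1, TimeUnit.MINUTES)
    }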
> > > > > > > > > > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang
> > > > > > > > > > > > > <wangguoz@gmail.com> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Made a pass over the doc, overall LGTM except a
> > > > > > > > > > > > > > minor comment on the throttling implementation:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Stated as "Request processing time throttling will
> > > > > > > > > > > > > > be applied on top if necessary." I thought that it
> > > > > > > > > > > > > > meant the request processing time throttling is
> > > > > > > > > > > > > > applied first, but continuing to read I found it
> > > > > > > > > > > > > > actually meant to apply produce / fetch byte rate
> > > > > > > > > > > > > > throttling first.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Also the last sentence "The remaining delay if any
> > > > > > > > > > > > > > is applied to the response." is a bit confusing to
> > > > > > > > > > > > > > me. Maybe reword it a bit?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Guozhang
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao
> > > > > > > > > > > > > > <jun@confluent.io> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks for the updated KIP. The latest proposal
> > > > > > > > > > > > > > > looks good to me.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram
> > > > > > > > > > > > > > > <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun/Roger,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you for the feedback.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1. I have updated the KIP to use absolute
> > > > > > > > > > > > > > > > units instead of percentage. The property is
> > > > > > > > > > > > > > > > called *io_thread_units* to align with the
> > > > > > > > > > > > > > > > thread count property *num.io.threads*. When we
> > > > > > > > > > > > > > > > implement network thread utilization quotas, we
> > > > > > > > > > > > > > > > can add another property
> > > > > > > > > > > > > > > > *network_thread_units*.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2. ControlledShutdown is already listed under
> > > > > > > > > > > > > > > > the exempt requests. Jun, did you mean a
> > > > > > > > > > > > > > > > different request that needs to be added? The
> > > > > > > > > > > > > > > > four requests currently exempt in the KIP are
> > > > > > > > > > > > > > > > StopReplica, ControlledShutdown, LeaderAndIsr
> > > > > > > > > > > > > > > > and UpdateMetadata. These are controlled using
> > > > > > > > > > > > > > > > the ClusterAction ACL, so it is easy to exclude
> > > > > > > > > > > > > > > > them and only throttle if unauthorized. I
> > > > > > > > > > > > > > > > wasn't sure if there are other requests used
> > > > > > > > > > > > > > > > only for inter-broker communication that needed
> > > > > > > > > > > > > > > > to be excluded.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 3. I was thinking the smallest change would be
> > > > > > > > > > > > > > > > to replace all references to
> > > > > > > > > > > > > > > > *requestChannel.sendResponse()* with a local
> > > > > > > > > > > > > > > > method *sendResponseMaybeThrottle()* that does
> > > > > > > > > > > > > > > > the throttling if any plus sends the response.
> > > > > > > > > > > > > > > > If we throttle first in *KafkaApis.handle()*,
> > > > > > > > > > > > > > > > the time spent within the method handling the
> > > > > > > > > > > > > > > > request will not be recorded or used in
> > > > > > > > > > > > > > > > throttling. We can look into this again when
> > > > > > > > > > > > > > > > the PR is ready for review.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Rajini
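
The shape of the helper described in point 3, sketched with stand-in types
(none of these are Kafka's real classes; the quota manager, request channel
and delay queue are reduced to placeholder functions):

    object SendResponseSketch {
      trait Response
      case class Request(clientId: String, handlerTimeNanos: Long)

      def recordAndGetThrottleMs(req: Request): Long = 0L // quota manager stand-in
      def send(resp: Response): Unit = ()                 // requestChannel stand-in
      def delay(resp: Response, ms: Long): Unit = ()      // delay queue stand-in

      // Record the handler time and either send now or delay the response.
      def sendResponseMaybeThrottle(req: Request, resp: Response): Unit = {
        val throttleMs = recordAndGetThrottleMs(req)
        if (throttleMs <= 0) send(resp) else delay(resp, throttleMs)
      }
    }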
> > > > > > > > > > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover
> > > > > > > > > > > > > > > > <roger.hoover@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Great to see this KIP and the excellent
> > > > > > > > > > > > > > > > > discussion.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > To me, Jun's suggestion makes sense. If my
> > > > > > > > > > > > > > > > > application is allocated 1 request handler
> > > > > > > > > > > > > > > > > unit, then it's as if I have a Kafka broker
> > > > > > > > > > > > > > > > > with a single request handler thread
> > > > > > > > > > > > > > > > > dedicated to me. That's the most I can use,
> > > > > > > > > > > > > > > > > at least. That allocation doesn't change even
> > > > > > > > > > > > > > > > > if an admin later increases the size of the
> > > > > > > > > > > > > > > > > request thread pool on the broker. It's
> > > > > > > > > > > > > > > > > similar to the CPU abstraction that VMs and
> > > > > > > > > > > > > > > > > containers get from hypervisors or OS
> > > > > > > > > > > > > > > > > schedulers. While different client access
> > > > > > > > > > > > > > > > > patterns can use wildly different amounts of
> > > > > > > > > > > > > > > > > request thread resources per request, a given
> > > > > > > > > > > > > > > > > application will generally have a stable
> > > > > > > > > > > > > > > > > access pattern and can figure out empirically
> > > > > > > > > > > > > > > > > how many "request thread units" it needs to
> > > > > > > > > > > > > > > > > meet its throughput/latency goals.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Cheers,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Roger
> > > > > > > > > > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao
> > > > > > > > > > > > > > > > > <jun@confluent.io> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks for the updated KIP. A few more
> > > > > > > > > > > > > > > > > > comments.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 1. A concern of request_time_percent is
> > > > > > > > > > > > > > > > > > that it's not an absolute value. Let's say
> > > > > > > > > > > > > > > > > > you give a user a 10% limit. If the admin
> > > > > > > > > > > > > > > > > > doubles the number of request handler
> > > > > > > > > > > > > > > > > > threads, that user now actually has twice
> > > > > > > > > > > > > > > > > > the absolute capacity. This may confuse
> > > > > > > > > > > > > > > > > > people a bit. So, perhaps setting the quota
> > > > > > > > > > > > > > > > > > based on an absolute request thread unit is
> > > > > > > > > > > > > > > > > > better.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2. ControlledShutdownRequest is also an
> > > > > > > > > > > > > > > > > > inter-broker request and needs to be
> > > > > > > > > > > > > > > > > > excluded from throttling.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 3. Implementation wise, I am wondering if
> > > > > > > > > > > > > > > > > > it's simpler to apply the request time
> > > > > > > > > > > > > > > > > > throttling first in KafkaApis.handle().
> > > > > > > > > > > > > > > > > > Otherwise, we will need to add the
> > > > > > > > > > > > > > > > > > throttling logic in each type of request.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini
> > > > > > > > > > > > > > > > > > Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thank you for the review.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I have reverted to the original KIP that
> > > > > > > > > > > > > > > > > > > throttles based on request handler
> > > > > > > > > > > > > > > > > > > utilization. At the moment, it uses
> > > > > > > > > > > > > > > > > > > percentage, but I am happy to change to a
> > > > > > > > > > > > > > > > > > > fraction (out of 1 instead of 100) if
> > > > > > > > > > > > > > > > > > > required. I have added the examples from
> > > > > > > > > > > > > > > > > > > this discussion to the KIP. Also added a
> > > > > > > > > > > > > > > > > > > "Future Work" section to address network
> > > > > > > > > > > > > > > > > > > thread utilization. The configuration is
> > > > > > > > > > > > > > > > > > > named "request_time_percent" with the
> > > > > > > > > > > > > > > > > > > expectation that it can also be used as
> > > > > > > > > > > > > > > > > > > the limit for network thread utilization
> > > > > > > > > > > > > > > > > > > when that is implemented, so that users
> > > > > > > > > > > > > > > > > > > have to set only one config for the two
> > > > > > > > > > > > > > > > > > > and not have to worry about the internal
> > > > > > > > > > > > > > > > > > > distribution of the work between the two
> > > > > > > > > > > > > > > > > > > thread pools in Kafka.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun
> > > > > > > > > > > > > > > > > > > Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks for the proposal.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > The benefit of using the request
> > > > > > > > > > > > > > > > > > > > processing time over the request rate
> > > > > > > > > > > > > > > > > > > > is exactly what people have said. I
> > > > > > > > > > > > > > > > > > > > will just expand that a bit. Consider
> > > > > > > > > > > > > > > > > > > > the following case. The producer sends
> > > > > > > > > > > > > > > > > > > > a produce request with a 10MB message
> > > > > > > > > > > > > > > > > > > > but compressed to 100KB with gzip. The
> > > > > > > > > > > > > > > > > > > > decompression of the message on the
> > > > > > > > > > > > > > > > > > > > broker could take 10-15 seconds, during
> > > > > > > > > > > > > > > > > > > > which time, a request handler thread is
> > > > > > > > > > > > > > > > > > > > completely blocked. In this case,
> > > > > > > > > > > > > > > > > > > > neither the byte-in quota nor the
> > > > > > > > > > > > > > > > > > > > request rate quota may be effective in
> > > > > > > > > > > > > > > > > > > > protecting the broker. Consider another
> > > > > > > > > > > > > > > > > > > > case. A consumer group starts with 10
> > > > > > > > > > > > > > > > > > > > instances and later on switches to 20
> > > > > > > > > > > > > > > > > > > > instances. The request rate will likely
> > > > > > > > > > > > > > > > > > > > double, but the actual load on the
> > > > > > > > > > > > > > > > > > > > broker may not double since each fetch
> > > > > > > > > > > > > > > > > > > > request only contains half of the
> > > > > > > > > > > > > > > > > > > > partitions. Request rate quota may not
> > > > > > > > > > > > > > > > > > > > be easy to configure in this case.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > What we really want is to be able to
> > > > > > > > > > > > > > > > > > > > prevent a client from using too much of
> > > > > > > > > > > > > > > > > > > > the server side resources. In this
> > > > > > > > > > > > > > > > > > > > particular KIP, this resource is the
> > > > > > > > > > > > > > > > > > > > capacity of the request handler
> > > > > > > > > > > > > > > > > > > > threads. I agree that it may not be
> > > > > > > > > > > > > > > > > > > > intuitive for the users to determine
> > > > > > > > > > > > > > > > > > > > how to set the right limit. However,
> > > > > > > > > > > > > > > > > > > > this is not completely new and has been
> > > > > > > > > > > > > > > > > > > > done in the container world already.
> > > > > > > > > > > > > > > > > > > > For example, Linux cgroup
> > > > > > > > > > > > > > > > > > > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> > > > > > > > > > > > > > > > > > > > has the concept of cpu.cfs_quota_us,
> > > > > > > > > > > > > > > > > > > > which specifies the total amount of
> > > > > > > > > > > > > > > > > > > > time in microseconds for which all
> > > > > > > > > > > > > > > > > > > > tasks in a cgroup can run during a one
> > > > > > > > > > > > > > > > > > > > second period. We can potentially model
> > > > > > > > > > > > > > > > > > > > the request handler threads in a
> > > > > > > > > > > > > > > > > > > > similar way. For example, each request
> > > > > > > > > > > > > > > > > > > > handler thread can be 1 request handler
> > > > > > > > > > > > > > > > > > > > unit and the admin can configure a
> > > > > > > > > > > > > > > > > > > > limit on how many units (say 0.01) a
> > > > > > > > > > > > > > > > > > > > client can have.
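
By analogy with cpu.cfs_quota_us, "request handler units" can be read as a
time budget per quota window; a sketch of the conversion (the window length
is assumed, not taken from the KIP):

    // Illustrative only: 1 unit = one fully used request handler thread
    // over the quota window, so 0.01 units = 10ms of handler time/second.
    object ThreadUnitsSketch extends App {
      val windowMs   = 1000.0
      val quotaUnits = 0.01
      val budgetMs   = quotaUnits * windowMs
      println(s"$quotaUnits units => $budgetMs ms of handler time per ${windowMs.toInt}ms window")
    }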
> > > > >> > >> > > > > > > > > > > > Regarding not throttling the internal
> > > broker
> > > > to
> > > > >> > >> broker
> > > > >> > >> > > > > > requests.
> > > > >> > >> > > > > > > We
> > > > >> > >> > > > > > > > > > could
> > > > >> > >> > > > > > > > > > > > do that. Alternatively, we could just
> let
> > > the
> > > > >> > admin
> > > > >> > >> > > > > configure a
> > > > >> > >> > > > > > > > high
> > > > >> > >> > > > > > > > > > > limit
> > > > >> > >> > > > > > > > > > > > for the kafka user (it may not be able
> to
> > > do
> > > > >> that
> > > > >> > >> > easily
> > > > >> > >> > > > > based
> > > > >> > >> > > > > > on
> > > > >> > >> > > > > > > > > > > clientId
> > > > >> > >> > > > > > > > > > > > though).
> > > > >> > >> > > > > > > > > > > >
> > > > >> > >> > > > > > > > > > > > Ideally we want to be able to protect
> the
> > > > >> > >> utilization
> > > > >> > >> > of
> > > > >> > >> > > > the
> > > > >> > >> > > > > > > > network
> > > > >> > >> > > > > > > > > > > thread
> > > > >> > >> > > > > > > > > > > > pool too. The difficult is mostly what
> > > Rajini
> > > > >> > said:
> > > > >> > >> (1)
> > > > >> > >> > > The
> > > > >> > >> > > > > > > > mechanism
> > > > >> > >> > > > > > > > > > for
> > > > >> > >> > > > > > > > > > > > throttling the requests is through
> > > Purgatory
> > > > >> and
> > > > >> > we
> > > > >> > >> > will
> > > > >> > >> > > > have
> > > > >> > >> > > > > > to
> > > > >> > >> > > > > > > > > think
> > > > >> > >> > > > > > > > > > > > through how to integrate that into the
> > > > network
> > > > >> > >> layer.
> > > > >> > >> > > (2)
> > > > >> > >> > > > In
> > > > >> > >> > > > > > the
> > > > >> > >> > > > > > > > > > network
> > > > >> > >> > > > > > > > > > > > layer, currently we know the user, but
> not
> > > > the
> > > > >> > >> clientId
> > > > >> > >> > > of
> > > > >> > >> > > > > the
> > > > >> > >> > > > > > > > > request.
> > > > >> > >> > > > > > > > > > > So,
> > > > >> > >> > > > > > > > > > > > it's a bit tricky to throttle based on
> > > > clientId
> > > > >> > >> there.
> > > > >> > >> > > > Plus,
> > > > >> > >> > > > > > the
> > > > >> > >> > > > > > > > > > byteOut
> > > > >> > >> > > > > > > > > > > > quota can already protect the network
> > > thread
> > > > >> > >> > utilization
> > > > >> > >> > > > for
> > > > >> > >> > > > > > > fetch
> > > > >> > >> > > > > > > > > > > > requests. So, if we can't figure out
> this
> > > > part
> > > > >> > right
> > > > >> > >> > now,
> > > > >> > >> > > > > just
> > > > >> > >> > > > > > > > > focusing
> > > > >> > >> > > > > > > > > > > on
> > > > >> > >> > > > > > > > > > > > the request handling threads for this
> KIP
> > > is
> > > > >> > still a
> > > > >> > >> > > useful
> > > > >> > >> > > > > > > > feature.
> > > > >> > >> > > > > > > > > > > >
> > > > >> > >> > > > > > > > > > > > Thanks,
> > > > >> > >> > > > > > > > > > > >
> > > > >> > >> > > > > > > > > > > > Jun
> > > > >> > >> > > > > > > > > > > >
> > > > >> > >> > > > > > > > > > > >
> On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> wrote:
>
> > Thank you all for the feedback.
> >
> > Jay: I have removed exemption for consumer heartbeat etc. Agree that
> > protecting the cluster is more important than protecting individual
> > apps. Have retained the exemption for StopReplica/LeaderAndIsr etc;
> > these are throttled only if authorization fails (so they can't be used
> > for DoS attacks in a secure cluster, but inter-broker requests complete
> > without delays).
> >
> > I will wait another day to see if there is any objection to quotas
> > based on request processing time (as opposed to request rate) and if
> > there are no objections, I will revert to the original proposal with
> > some changes.
> >
> > The original proposal was only including the time used by the request
> > handler threads (that made calculation easy). I think the suggestion is
> > to include the time spent in the network threads as well since that may
> > be significant. As Jay pointed out, it is more complicated to calculate
> > the total available CPU time and convert to a ratio when there are *m*
> > I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime()
> > may give us what we want, but it can be very expensive on some
> > platforms. As Becket and Guozhang have pointed out, we do have several
> > time measurements already for generating metrics that we could use,
> > though we might want to switch to nanoTime() instead of
> > currentTimeMillis() since some of the values for small requests may be
> > < 1ms. But rather than add up the time spent in the I/O thread and the
> > network thread, wouldn't it be better to convert the time spent on each
> > thread into a separate ratio? UserA has a request quota of 5%. Can we
> > take that to mean that UserA can use 5% of the time on network threads
> > and 5% of the time on I/O threads? If either is exceeded, the response
> > is throttled - it would mean maintaining two sets of metrics for the
> > two durations, but would result in more meaningful ratios. We could
> > define two quota limits (UserA has 5% of request threads and 10% of
> > network threads), but that seems unnecessary and harder to explain to
> > users.
> >
> > Back to why and how quotas are applied to network thread utilization:
> >
> > a) In the case of fetch, the time spent in the network thread may be
> > significant and I can see the need to include this. Are there other
> > requests where the network thread utilization is significant? In the
> > case of fetch, request handler thread utilization would throttle
> > clients with a high request rate and low data volume, and the fetch
> > byte rate quota will throttle clients with high data volume. Network
> > thread utilization is perhaps proportional to the data volume. I am
> > wondering if we even need to throttle based on network thread
> > utilization or whether the data volume quota covers this case.
> >
> > b) At the moment, we record and check for quota violation at the same
> > time. If a quota is violated, the response is delayed. Using Jay's
> > example of disk reads for fetches happening in the network thread, we
> > can't record and delay a response after the disk reads. We could record
> > the time spent on the network thread when the response is complete and
> > introduce a delay for handling a subsequent request (separating out
> > recording and quota violation handling in the case of network thread
> > overload). Does that make sense?
> >
> > Regards,
> >
> > Rajini
> >
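
As an aside on the two-ratio idea above: a minimal sketch, in Java, of what
the per-pool check could look like. The sensor values and names here are
hypothetical stand-ins, not actual Kafka code.

    // Illustrative only: both thread pools are checked against the same
    // quota; exceeding either one marks the response for throttling.
    public class TwoRatioQuotaCheck {

        // Hypothetical inputs: the fraction of each pool's time used by this
        // user over the quota window, e.g. 0.03 == 3%.
        static boolean shouldThrottle(double ioThreadRatio,
                                      double networkThreadRatio,
                                      double quotaRatio) {
            return ioThreadRatio > quotaRatio || networkThreadRatio > quotaRatio;
        }

        public static void main(String[] args) {
            // UserA has a 5% quota, uses 2% of io threads, 7% of network threads.
            System.out.println(shouldThrottle(0.02, 0.07, 0.05)); // true
        }
    }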
> > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com>
> > wrote:
> >
> > > Hey Jay,
> > >
> > > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > > thinking that maybe we can use the existing request statistics. They
> > > are already very detailed, so we can probably see the approximate CPU
> > > time from them, e.g. something like (total_time -
> > > request/response_queue_time - remote_time).
> > >
> > > I agree with Guozhang that when a user is throttled it is likely that
> > > we need to see if anything has gone wrong first, and if the users are
> > > well behaved and just need more resources, we will have to bump up
> > > the quota for them. It is true that pre-allocating CPU time quota
> > > precisely for the users is difficult. So in practice it would
> > > probably be more like first setting a relatively high protective CPU
> > > time quota for everyone and increasing it for some individual clients
> > > on demand.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
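
For illustration, a rough sketch of the approximation Becket describes. The
field names below are hypothetical stand-ins for the existing per-request
time breakdown in the broker's request metrics.

    public class ApproxCpuTime {

        // Hypothetical holder for the per-request time breakdown, all in ms.
        static class RequestStats {
            long totalTimeMs;
            long requestQueueTimeMs;
            long responseQueueTimeMs;
            long remoteTimeMs; // time parked in purgatory, waiting on remote state
        }

        // total_time - request/response_queue_time - remote_time: whatever is
        // not queueing or waiting is roughly the time a thread actively worked.
        static long approxCpuTimeMs(RequestStats s) {
            return s.totalTimeMs - s.requestQueueTimeMs
                    - s.responseQueueTimeMs - s.remoteTimeMs;
        }
    }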
> > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com>
> > > wrote:
> > >
> > > > This is a great proposal, glad to see it happening.
> > > >
> > > > I am inclined to the CPU throttling, or more specifically the
> > > > processing time ratio, instead of the request rate throttling as
> > > > well. Becket has summed up my rationales very well above, and one
> > > > thing to add here is that the former has good support both for
> > > > "protecting against rogue clients" and for "utilizing a cluster for
> > > > multi-tenancy usage": when thinking about how to explain this to
> > > > end users, I find it actually more natural than the request rate
> > > > since, as mentioned above, different requests will have quite
> > > > different "cost", and Kafka today already has various request types
> > > > (produce, fetch, admin, metadata, etc). Because of that, request
> > > > rate throttling may not be as effective unless it is set very
> > > > conservatively.
> > > >
> > > > Regarding user reactions when they are throttled, I think it may
> > > > differ case-by-case, and needs to be discovered / guided by looking
> > > > at related metrics. So in other words users would not expect to get
> > > > additional information by simply being told "hey, you are
> > > > throttled", which is all that throttling does; they need to take a
> > > > follow-up step and see "hmm, I'm throttled probably because of ..",
> > > > which is done by looking at other metric values: e.g. whether I'm
> > > > bombarding the brokers with metadata requests, which are usually
> > > > cheap to handle but I'm sending thousands per second; or is it
> > > > because I'm catching up and hence sending very heavy fetch requests
> > > > with large min.bytes, etc.
> > > >
> > > > Regarding the implementation, as once discussed with Jun, this
> > > > seems not very difficult since today we are already collecting the
> > > > "thread pool utilization" metrics, which is a single percentage
> > > > "aggregateIdleMeter" value; but we are already effectively
> > > > aggregating it for each request in KafkaRequestHandler, and we can
> > > > just extend it by recording the source client id when handling them
> > > > and aggregating by clientId as well as the total aggregate.
> > > >
> > > > Guozhang
> > > >
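
A sketch of the per-clientId aggregation Guozhang mentions, with
hypothetical names; the real bookkeeping would hang off the existing
handler loop and metrics library rather than a static map.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    public class PerClientHandlerTime {

        // Accumulated io-thread nanoseconds, per clientId plus a total.
        static final Map<String, LongAdder> nanosByClient = new ConcurrentHashMap<>();
        static final LongAdder totalNanos = new LongAdder();

        // Called around each request in the handler loop.
        static void recordHandlerTime(String clientId, long startNanos) {
            long elapsed = System.nanoTime() - startNanos;
            nanosByClient.computeIfAbsent(clientId, id -> new LongAdder()).add(elapsed);
            totalNanos.add(elapsed);
        }
    }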
> > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <jay@confluent.io>
> > > > wrote:
> > > >
> > > > > Hey Becket/Rajini,
> > > > >
> > > > > When I thought about it more deeply I came around to the "percent
> > > > > of processing time" metric too. It seems a lot closer to the
> > > > > thing we actually care about and need to protect. I also think
> > > > > this would be a very useful metric even in the absence of
> > > > > throttling just to debug who's using capacity.
> > > > >
> > > > > Two problems to consider:
> > > > >
> > > > >    1. I agree that for the user it is understandable what led to
> > > > >    their being throttled, but it is a bit hard to figure out the
> > > > >    safe range for them. i.e. if I have a new app that will send
> > > > >    200 messages/sec I can probably reason that I'll be under the
> > > > >    throttling limit of 300 req/sec. However if I need to be under
> > > > >    a 10% CPU resources limit it may be a bit harder for me to
> > > > >    know a priori if I will or won't.
> > > > >    2. Calculating the available CPU time is a bit difficult since
> > > > >    there are actually two thread pools--the I/O threads and the
> > > > >    network threads. I think it might be workable to count just
> > > > >    the I/O thread time as in the proposal, but the network thread
> > > > >    work is actually non-trivial (e.g. all the disk reads for
> > > > >    fetches happen in that thread). If you count both the network
> > > > >    and I/O threads it can skew things a bit. E.g. say you have 50
> > > > >    network threads, 10 I/O threads, and 8 cores, what is the
> > > > >    available CPU time in a second? I suppose this is a problem
> > > > >    whenever you have a bottleneck between I/O and network threads
> > > > >    or if you end up significantly over-provisioning one pool
> > > > >    (both of which are hard to avoid).
> > > > >
> > > > > An alternative for CPU throttling would be to use this api:
> > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> > > > >
> > > > > That would let you track actual CPU usage across the network,
> > > > > I/O, and purgatory threads and look at it as a percentage of
> > > > > total cores. I think this fixes many problems in the reliability
> > > > > of the metric. Its meaning is slightly different as it is just
> > > > > CPU (you don't get charged for time blocking on I/O) but that may
> > > > > be okay because we already have a throttle on I/O. The downside
> > > > > is I think it is possible this api can be disabled or isn't
> > > > > always available and it may also be expensive (also I've never
> > > > > used it so not sure if it really works the way I think).
> > > > >
> > > > > -Jay
> > > > >
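
The API Jay refers to does exist in java.lang.management. A small
self-contained probe (not broker code) that also shows the caveats he
lists: support and enablement have to be checked, and dead threads
return -1.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class ThreadCpuProbe {
        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            if (!bean.isThreadCpuTimeSupported()) {
                System.out.println("Thread CPU time not supported on this JVM");
                return;
            }
            if (!bean.isThreadCpuTimeEnabled()) {
                bean.setThreadCpuTimeEnabled(true); // may itself be restricted
            }
            for (long id : bean.getAllThreadIds()) {
                long cpuNanos = bean.getThreadCpuTime(id); // -1 if the thread has died
                if (cpuNanos >= 0) {
                    System.out.printf("thread %d: %d ms CPU%n", id, cpuNanos / 1_000_000);
                }
            }
        }
    }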
> > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > If the purpose of the KIP is only to protect the cluster from
> > > > > > being overwhelmed by crazy clients and is not intended to
> > > > > > address the resource allocation problem among the clients, I am
> > > > > > wondering if using a request handling time quota (CPU time
> > > > > > quota) is a better option. Here are the reasons:
> > > > > >
> > > > > > 1. A request handling time quota has better protection. Say we
> > > > > > have a request rate quota and set that to some value like 100
> > > > > > requests/sec; it is possible that some of the requests are very
> > > > > > expensive and actually take a lot of time to handle. In that
> > > > > > case a few clients may still occupy a lot of CPU time even
> > > > > > though the request rate is low. Arguably we can carefully set a
> > > > > > request rate quota for each request and client id combination,
> > > > > > but it could still be tricky to get it right for everyone.
> > > > > >
> > > > > > If we use the request handling time quota, we can simply say
> > > > > > that no client can take up more than 30% of the total request
> > > > > > handling capacity (measured by time), regardless of the
> > > > > > difference among requests or what the client is doing. In this
> > > > > > case maybe we can quota all the requests if we want to.
> > > > > >
> > > > > > 2. The main benefit of using a request rate limit is that it
> > > > > > seems more intuitive. It is true that it is probably easier to
> > > > > > explain to the user what it means. However, in practice the
> > > > > > impact of a request rate quota is not more quantifiable than
> > > > > > that of a request handling time quota. Unlike the byte rate
> > > > > > quota, it is still difficult to give a number for the impact on
> > > > > > throughput or latency when a request rate quota is hit. So it
> > > > > > is not better than the request handling time quota. In fact I
> > > > > > feel it is clearer to tell a user "you are limited because you
> > > > > > have taken 30% of the CPU time on the broker" than something
> > > > > > like "your request rate quota on metadata requests has been
> > > > > > reached".
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
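
Becket's point 1 is easy to see with concrete, made-up numbers: a client
comfortably inside a 100 requests/sec quota can still monopolize the
handler pool if each request is expensive.

    public class ExpensiveRequestMath {
        public static void main(String[] args) {
            int requestsPerSec = 100;       // within a 100 req/sec quota
            double secondsPerRequest = 0.2; // but each request takes 200 ms to handle
            // 100 * 0.2 = 20 thread-seconds of work per second of wall time,
            // i.e. the equivalent of 20 io threads kept fully busy.
            System.out.println(requestsPerSec * secondsPerRequest + " threads busy");
        }
    }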
> > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > I think this proposal makes a lot of sense (especially now
> > > > > > > that it is oriented around request rate) and fills the
> > > > > > > biggest remaining gap in the multi-tenancy story.
> > > > > > >
> > > > > > > I think for intra-cluster communication (StopReplica, etc) we
> > > > > > > could avoid throttling entirely. You can secure or otherwise
> > > > > > > lock down the cluster communication to prevent any
> > > > > > > unauthorized external party from trying to initiate these
> > > > > > > requests. As a result we are as likely to cause problems as
> > > > > > > solve them by throttling these, right?
> > > > > > >
> > > > > > > I'm not so sure that we should exempt the consumer requests
> > > > > > > such as heartbeat. It's true that if we throttle an app's
> > > > > > > heartbeat requests it may cause it to fall out of its
> > > > > > > consumer group. However if we don't throttle it, it may DDOS
> > > > > > > the cluster if the heartbeat interval is set incorrectly or
> > > > > > > if some client in some language has a bug. I think the policy
> > > > > > > with this kind of throttling is to protect the cluster above
> > > > > > > any individual app, right? I think in general this should be
> > > > > > > okay since for most deployments this setting is meant as more
> > > > > > > of a safety valve---that is, rather than set something very
> > > > > > > close to what you expect to need (say 2 req/sec or whatever)
> > > > > > > you would have something quite high (like 100 req/sec) with
> > > > > > > this meant to prevent a client gone crazy. I think when used
> > > > > > > this way, allowing those to be throttled would actually
> > > > > > > provide meaningful protection.
> > > > > > >
> > > > > > > -Jay
> > > > > > >
> > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram
> > > > > > > <rajinisivaram@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi all,
> > > > > > > >
> > > > > > > > I have just created KIP-124 to introduce request rate
> > > > > > > > quotas to Kafka:
> > > > > > > >
> > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> > > > > > > >
> > > > > > > > The proposal is for a simple percentage request handling
> > > > > > > > time quota that can be allocated to *<client-id>*, *<user>*
> > > > > > > > or *<user, client-id>*. There are a few other suggestions
> > > > > > > > also under "Rejected alternatives". Feedback and
> > > > > > > > suggestions are welcome.
> > > > > > > >
> > > > > > > > Thank you...
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > >
> > > > > > > > Rajini

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Colin McCabe <cm...@apache.org>.
I noticed that the throttle_time_ms added to all the message responses
is in milliseconds.  Does it make sense to express this in microseconds
in case we start doing more fine-grained CPU throttling later on?  An
int32 should still be more than enough if using microseconds.

best,
Colin
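
As a quick back-of-the-envelope check of Colin's point (illustrative
arithmetic only): an int32 carrying microseconds tops out at roughly 35
minutes, far above any sensible throttle time.

    public class ThrottleFieldRange {
        public static void main(String[] args) {
            long maxMicros = Integer.MAX_VALUE; // 2,147,483,647 microseconds
            double minutes = maxMicros / 1_000_000.0 / 60.0;
            System.out.printf("max int32 throttle time: %.1f minutes%n", minutes); // ~35.8
        }
    }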


On Fri, Feb 24, 2017, at 10:31, Jun Rao wrote:
> Hi, Jay,
> 
> 2. Regarding request.unit vs request.percentage: I started with
> request.percentage too. The reasoning for request.unit is the following.
> Suppose that the capacity has been reached on a broker and the admin needs
> to add a new user. A simple way to increase the capacity is to increase the
> number of io threads, assuming there are still enough cores. If the limit
> is based on percentage, the additional capacity automatically gets
> distributed to existing users and we haven't really carved out any
> additional resource for the new user. Now, is it easy for a user to reason
> about 0.1 unit vs 10%? My feeling is that both are hard and have to be
> configured empirically. Not sure if percentage is obviously easier to
> reason about.
> 
> Thanks,
> 
> Jun
> 
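
To make Jun's point concrete with made-up numbers: under a unit-based
quota, adding io threads leaves an existing user's absolute allocation
unchanged, while under a percentage quota the same change silently grows
it.

    public class UnitVsPercent {
        public static void main(String[] args) {
            double userUnits = 0.8; // fixed absolute quota, in io-thread units
            int threadsBefore = 8, threadsAfter = 16;

            // Unit quota: absolute capacity constant, relative share shrinks.
            System.out.printf("0.8 units = %.1f%% of 8 threads, %.1f%% of 16 threads%n",
                    100 * userUnits / threadsBefore, 100 * userUnits / threadsAfter);

            // Percentage quota: relative share constant, absolute capacity grows.
            double userPercent = 0.10;
            System.out.printf("10%% = %.1f units of 8 threads, %.1f units of 16 threads%n",
                    userPercent * threadsBefore, userPercent * threadsAfter);
        }
    }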
> On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:
> 
> > A couple of quick points:
> >
> > 1. Even though the implementation of this quota is only using io thread
> > time, I think we should call it something like "request-time". This will
> > give us flexibility to improve the implementation to cover network threads
> > in the future and will avoid exposing internal details like our thread
> > pools on the server.
> >
> > 2. Jun/Roger, I get what you are trying to fix but the idea of thread/units
> > is super unintuitive as a user-facing knob. I had to read the KIP like
> > eight times to understand this. I'm not sure about your point that
> > increasing the number of threads is a problem with a percentage-based
> > value; it really depends on whether the user thinks about the "percentage
> > of request processing time" or "thread units". If they think "I have
> > allocated 10% of my request processing time to user x" then it is a bug
> > that increasing the thread count decreases that percent as it does in the
> > current proposal. As a practical matter I think the only way to actually
> > reason about this is as a percent---I just don't believe people are going
> > to think, "ah, 4.3 thread units, that is the right amount!". Instead I
> > think they have to understand this thread unit concept, figure out what
> > they have set in number of threads, compute a percent and then come up with
> > the number of thread units, and these will all be wrong if that thread
> > count changes. I also think this ties us to throttling the I/O thread pool,
> > which may not be where we want to end up.
> >
> > 3. For what it's worth I do think having a single throttle_ms field in all
> > the responses that combines all throttling from all quotas is probably the
> > simplest. There could be a use case for having separate fields for each,
> > but I think that is actually harder to use/monitor in the common case so
> > unless someone has a use case I think just one should be fine.
> >
> > -Jay
> >
> > On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <ra...@gmail.com>
> > wrote:
> >
> > > I have updated the KIP based on the discussions so far.
> > >
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > rajinisivaram@gmail.com>
> > > wrote:
> > >
> > > > Thank you all for the feedback.
> > > >
> > > > Ismael #1. It makes sense not to throttle inter-broker requests like
> > > > LeaderAndIsr etc. The simplest way to ensure that clients cannot use
> > > > these requests to bypass quotas for DoS attacks is to ensure that ACLs
> > > > prevent clients from using these requests and unauthorized requests
> > > > are included towards quotas.
> > > >
> > > > Ismael #2, Jay #1: I was thinking that these quotas can return a
> > > > separate throttle time, and all utilization based quotas could use the
> > > > same field (we won't add another one for network thread utilization,
> > > > for instance). But perhaps it makes sense to keep byte rate quotas
> > > > separate in produce/fetch responses to provide separate metrics? Agree
> > > > with Ismael that the name of the existing field should be changed if
> > > > we have two. Happy to switch to a single combined throttle time if
> > > > that is sufficient.
> > > >
> > > > Ismael #4, #5, #6: Will update KIP. Will use a dot-separated name for
> > > > the new property. Replication quotas use dot-separated names, so it
> > > > will be consistent with all properties except byte rate quotas.
> > > >
> > > > Radai: #1 Request processing time rather than request rate was chosen
> > > > because the time per request can vary significantly between requests,
> > > > as mentioned in the discussion and the KIP.
> > > > #2 Two separate quotas for heartbeats/regular requests feel like more
> > > > configuration and more metrics. Since most users would set quotas
> > > > higher than the expected usage and quotas are more of a safety net, a
> > > > single quota should work in most cases.
> > > > #3 The number of requests in purgatory is limited by the number of
> > > > active connections, since only one request per connection will be
> > > > throttled at a time.
> > > > #4 As with byte rate quotas, to use the full allocated quotas,
> > > > clients/users would need to use partitions that are distributed
> > > > across the cluster. The alternative of using cluster-wide quotas
> > > > instead of per-broker quotas would be far too complex to implement.
> > > >
> > > > Dong: We currently have two ClientQuotaManagers for quota types Fetch
> > > > and Produce. A new one will be added for IOThread, which manages
> > > > quotas for I/O thread utilization. This will not update the Fetch or
> > > > Produce queue-size, but will have a separate metric for the
> > > > queue-size. I wasn't planning to add any additional metrics apart from
> > > > the equivalent ones for existing quotas as part of this KIP. The ratio
> > > > of byte-rate to I/O thread utilization could be slightly misleading
> > > > since it depends on the sequence of requests. But we can look into
> > > > more metrics after the KIP is implemented if required.
> > > >
> > > > I think we need to limit the maximum delay since all requests are
> > > > throttled. If a client has a quota of 0.001 units and a single request
> > > > used 50ms, we don't want to delay all requests from the client by 50
> > > > seconds, throwing the client out of all its consumer groups. The issue
> > > > arises only if a user is allocated a quota that is insufficient to
> > > > process one large request. The expectation is that the units allocated
> > > > per user will be much higher than the time taken to process one
> > > > request, and the limit should seldom be applied. Agree this needs
> > > > proper documentation.
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > >
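
A sketch of the capped delay computation Rajini describes above, with
hypothetical names; the real window and sensor bookkeeping live in the
broker's quota manager.

    public class CappedThrottleTime {

        // Delay proportional to how far observed usage overshoots the quota,
        // but never more than one quota window, so a single expensive request
        // cannot stall a client for, say, 50 seconds.
        static long throttleTimeMs(double observedUnits, double quotaUnits,
                                   long windowMs) {
            long delay = (long) ((observedUnits - quotaUnits) / quotaUnits * windowMs);
            return Math.min(delay, windowMs);
        }

        public static void main(String[] args) {
            // Quota of 0.001 units; one request used 50 ms of a 1-second window.
            // Uncapped, the delay would be ~49 seconds; the cap keeps it at 1s.
            System.out.println(throttleTimeMs(0.05, 0.001, 1000)); // 1000
        }
    }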
> > > > On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com>
> > > wrote:
> > > >
> > > >> @jun: i wasn't concerned about tying up a request processing thread,
> > > >> but IIUC the code does still read the entire request out, which might
> > > >> add up to a non-negligible amount of memory.
> > > >>
> > > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com>
> > wrote:
> > > >>
> > > >> > Hey Rajini,
> > > >> >
> > > >> > The current KIP says that the maximum delay will be reduced to the
> > > >> > window size if it is larger than the window size. I have a concern
> > > >> > with this:
> > > >> >
> > > >> > 1) This essentially means that the user is allowed to exceed their
> > > >> > quota over a long period of time. Can you provide an upper bound on
> > > >> > this deviation?
> > > >> >
> > > >> > 2) What is the motivation for capping the maximum delay at the
> > > >> > window size? I am wondering if there is a better alternative to
> > > >> > address the problem.
> > > >> >
> > > >> > 3) It means that the existing metric-related config will have a
> > > >> > more direct impact on the mechanism of this io-thread-unit-based
> > > >> > quota. This may be an important change depending on the answer to
> > > >> > 1) above. We probably need to document this more explicitly.
> > > >> >
> > > >> > Dong
> > > >> >
> > > >> >
> > > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com>
> > > wrote:
> > > >> >
> > > >> > > Hey Jun,
> > > >> > >
> > > >> > > Yeah you are right. I thought it wasn't because at LinkedIn it
> > > >> > > would be too much pressure on inGraph to expose those
> > > >> > > per-clientId metrics, so we ended up printing them periodically
> > > >> > > to a local log. Never mind if it is not a general problem.
> > > >> > >
> > > >> > > Hey Rajini,
> > > >> > >
> > > >> > > - I agree with Jay that we probably don't want to add a new
> > > >> > > field to ProduceResponse or FetchResponse for every quota. Is
> > > >> > > there any use-case for having separate throttle-time fields for
> > > >> > > byte-rate-quota and io-thread-unit-quota? You probably need to
> > > >> > > document this as an interface change if you plan to add a new
> > > >> > > field in any request.
> > > >> > >
> > > >> > > - I don't think IOThread belongs in quotaType. The existing
> > > >> > > quota types (i.e. Produce/Fetch/LeaderReplication/
> > > >> > > FollowerReplication) identify the type of request that is
> > > >> > > throttled, not the quota mechanism that is applied.
> > > >> > >
> > > >> > > - If a request is throttled due to this io-thread-unit-based
> > > >> > > quota, is the existing queue-size metric in ClientQuotaManager
> > > >> > > incremented?
> > > >> > >
> > > >> > > - In the interest of providing a guideline for the admin to
> > > >> > > decide the io-thread-unit-based quota, and for users to
> > > >> > > understand its impact on their traffic, would it be useful to
> > > >> > > have a metric that shows the overall byte-rate per
> > > >> > > io-thread-unit? Can we also show this as a per-clientId metric?
> > > >> > >
> > > >> > > Thanks,
> > > >> > > Dong
> > > >> > >
> > > >> > >
> > > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io>
> > wrote:
> > > >> > >
> > > >> > >> Hi, Ismael,
> > > >> > >>
> > > >> > >> For #3, typically, an admin won't configure more io threads
> > > >> > >> than CPU cores, but it's possible for an admin to start with
> > > >> > >> fewer io threads than cores and grow that later on.
> > > >> > >>
> > > >> > >> Hi, Dong,
> > > >> > >>
> > > >> > >> I think the throttleTime sensor on the broker tells the admin
> > > >> > >> whether a user/clientId is throttled or not.
> > > >> > >>
> > > >> > >> Hi, radai,
> > > >> > >>
> > > >> > >> The reasoning for delaying the throttled requests on the broker
> > > >> > >> instead of returning an error immediately is that the latter
> > > >> > >> has no way to prevent the client from retrying immediately,
> > > >> > >> which will make things worse. The delaying logic is based on a
> > > >> > >> delay queue. A separate expiration thread just waits for the
> > > >> > >> next request to expire. So, it doesn't tie up a request handler
> > > >> > >> thread.
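
For illustration, a rough sketch of that delay-queue mechanism (class and
method names here are made up, not Kafka's actual code):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // A throttled response is parked in a DelayQueue until its delay expires.
    class ThrottledResponse(val send: () => Unit, delayMs: Long) extends Delayed {
      private val dueMs = System.currentTimeMillis + delayMs
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(dueMs - System.currentTimeMillis, TimeUnit.MILLISECONDS)
      override def compareTo(other: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.MILLISECONDS),
                               other.getDelay(TimeUnit.MILLISECONDS))
    }

    object ThrottleQueueSketch {
      val queue = new DelayQueue[ThrottledResponse]()
      // One expiration thread blocks on take(), which returns the head only
      // once its delay has expired, so no request handler thread is tied up
      // while a response waits.
      val expirer = new Thread(() => {
        while (!Thread.currentThread.isInterrupted) queue.take().send()
      }, "throttled-response-expirer")
    }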
> > > >> > >>
> > > >> > >> Thanks,
> > > >> > >>
> > > >> > >> Jun
> > > >> > >>
> > > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk>
> > > >> wrote:
> > > >> > >>
> > > >> > >> > Hi Jay,
> > > >> > >> >
> > > >> > >> > Regarding 1, I definitely like the simplicity of keeping a
> > > >> > >> > single throttle time field in the response. The downside is
> > > >> > >> > that the client metrics will be more coarse-grained.
> > > >> > >> >
> > > >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage`
> > > >> > >> > and `log.cleaner.min.cleanable.ratio`.
> > > >> > >> >
> > > >> > >> > Ismael
> > > >> > >> >
> > > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io>
> > > >> wrote:
> > > >> > >> >
> > > >> > >> > > A few minor comments:
> > > >> > >> > >
> > > >> > >> > >    1. Isn't it the case that the throttling time response
> > > >> > >> > >    field should have the total time your request was
> > > >> > >> > >    throttled, irrespective of the quotas that caused it?
> > > >> > >> > >    Limiting it to the byte rate quota doesn't make sense,
> > > >> > >> > >    but I also don't think we want to end up adding new
> > > >> > >> > >    fields in the response for every single thing we quota,
> > > >> > >> > >    right?
> > > >> > >> > >    2. I don't think we should make this quota specifically
> > > >> > >> > >    about io threads. Once we introduce these quotas people
> > > >> > >> > >    set them and expect them to be enforced (and if they
> > > >> > >> > >    aren't it may cause an outage). As a result they are a
> > > >> > >> > >    bit more sensitive than normal configs, I think. The
> > > >> > >> > >    current thread pools seem like something of an
> > > >> > >> > >    implementation detail and not the level the user-facing
> > > >> > >> > >    quotas should be involved with. I think it might be
> > > >> > >> > >    better to make this a general request-time throttle with
> > > >> > >> > >    no mention of I/O threads in the naming, and simply
> > > >> > >> > >    acknowledge in the docs the current limitation (which we
> > > >> > >> > >    may someday fix) that this covers only the time after
> > > >> > >> > >    the request is read off the network.
> > > >> > >> > >    3. As such I think the right interface to the user would
> > > >> > >> > >    be something like percent_request_time in {0,...,100} or
> > > >> > >> > >    request_time_ratio in {0.0,...,1.0} (I think "ratio" is
> > > >> > >> > >    the terminology we used if the scale is between 0 and 1
> > > >> > >> > >    in the other metrics, right?)
> > > >> > >> > >
> > > >> > >> > > -Jay
> > > >> > >> > >
> > > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
> > > >> > >> rajinisivaram@gmail.com
> > > >> > >> > >
> > > >> > >> > > wrote:
> > > >> > >> > >
> > > >> > >> > > > Guozhang/Dong,
> > > >> > >> > > >
> > > >> > >> > > > Thank you for the feedback.
> > > >> > >> > > >
> > > >> > >> > > > Guozhang: I have updated the section on co-existence of
> > > >> > >> > > > byte rate and request time quotas.
> > > >> > >> > > >
> > > >> > >> > > > Dong: I hadn't added much detail to the metrics and
> > > >> > >> > > > sensors since they are going to be very similar to the
> > > >> > >> > > > existing metrics and sensors. To avoid confusion, I have
> > > >> > >> > > > now added more detail. All metrics are in the group
> > > >> > >> > > > "quotaType" and all sensors have names starting with
> > > >> > >> > > > "quotaType" (where quotaType is Produce/Fetch/
> > > >> > >> > > > LeaderReplication/FollowerReplication/*IOThread*). So
> > > >> > >> > > > there will be no reuse of existing metrics/sensors. The
> > > >> > >> > > > new ones for request processing time based throttling
> > > >> > >> > > > will be completely independent of existing
> > > >> > >> > > > metrics/sensors, but will be consistent in format.
> > > >> > >> > > >
> > > >> > >> > > > The existing throttle_time_ms field in produce/fetch
> > > >> > >> > > > responses will not be impacted by this KIP. That will
> > > >> > >> > > > continue to return byte-rate based throttling times. In
> > > >> > >> > > > addition, a new field request_throttle_time_ms will be
> > > >> > >> > > > added to return request quota based throttling times.
> > > >> > >> > > > These will be exposed as new metrics on the client-side.
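
For illustration only (the KIP will define the actual schema), the new
field could sit next to the existing one in the protocol-doc notation:

    FetchResponse => throttle_time_ms request_throttle_time_ms [responses]
      throttle_time_ms => INT32          (byte-rate quota delay, as today)
      request_throttle_time_ms => INT32  (new: request quota delay)
      responses => ...                   (unchanged)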
> > > >> > >> > > >
> > > >> > >> > > > Since all metrics and sensors are different for each type
> > > >> > >> > > > of quota, I believe there are already sufficient metrics
> > > >> > >> > > > to monitor throttling on both the client and broker side
> > > >> > >> > > > for each type of throttling.
> > > >> > >> > > >
> > > >> > >> > > > Regards,
> > > >> > >> > > >
> > > >> > >> > > > Rajini
> > > >> > >> > > >
> > > >> > >> > > >
> > > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <
> > > lindong28@gmail.com
> > > >> >
> > > >> > >> wrote:
> > > >> > >> > > >
> > > >> > >> > > > > Hey Rajini,
> > > >> > >> > > > >
> > > >> > >> > > > > I think it makes a lot of sense to use io_thread_units
> > > >> > >> > > > > as the metric to quota users' traffic here. LGTM
> > > >> > >> > > > > overall. I have some questions regarding sensors.
> > > >> > >> > > > >
> > > >> > >> > > > > - Can you be more specific in the KIP about what
> > > >> > >> > > > > sensors will be added? For example, it will be useful
> > > >> > >> > > > > to specify the name and attributes of these new
> > > >> > >> > > > > sensors.
> > > >> > >> > > > >
> > > >> > >> > > > > - We currently have throttle-time and queue-size for
> > > >> > >> > > > > the byte-rate based quota. Are you going to have
> > > >> > >> > > > > separate throttle-time and queue-size for requests
> > > >> > >> > > > > throttled by the io_thread_unit-based quota, or will
> > > >> > >> > > > > they share the same sensor?
> > > >> > >> > > > >
> > > >> > >> > > > > - Does the throttle-time in the ProduceResponse and
> > > >> > >> > > > > FetchResponse contain time due to the
> > > >> > >> > > > > io_thread_unit-based quota?
> > > >> > >> > > > >
> > > >> > >> > > > > - Currently the Kafka server doesn't provide any log or
> > > >> > >> > > > > metric that tells whether any given clientId (or user)
> > > >> > >> > > > > is throttled. This is not too bad because we can still
> > > >> > >> > > > > check the client-side byte-rate metric to validate
> > > >> > >> > > > > whether a given client is throttled. But with this
> > > >> > >> > > > > io_thread_unit, there will be no way to validate
> > > >> > >> > > > > whether a given client is slow because it has exceeded
> > > >> > >> > > > > its io_thread_unit limit. It is necessary for users to
> > > >> > >> > > > > be able to know this information to figure out whether
> > > >> > >> > > > > they have reached their quota limit. How about we add a
> > > >> > >> > > > > log4j log on the server side to periodically print
> > > >> > >> > > > > (client_id, byte-rate-throttle-time,
> > > >> > >> > > > > io-thread-unit-throttle-time) so that the Kafka
> > > >> > >> > > > > administrator can identify those users that have
> > > >> > >> > > > > reached their limit and act accordingly?
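
A rough sketch of what such periodic logging could look like (the snapshot
function is a stand-in for reading the broker's throttle-time sensors, and
println stands in for log4j):

    import java.util.concurrent.{Executors, TimeUnit}

    object ThrottleTimeLogger {
      case class Throttling(clientId: String,
                            byteRateMs: Double,
                            ioThreadUnitMs: Double)

      // Periodically log every client throttled in the last period.
      def start(snapshot: () => Seq[Throttling], periodSecs: Long): Unit = {
        val scheduler = Executors.newSingleThreadScheduledExecutor()
        scheduler.scheduleAtFixedRate(() => {
          snapshot().filter(t => t.byteRateMs > 0 || t.ioThreadUnitMs > 0)
            .foreach { t =>
              println(s"throttled client_id=${t.clientId} " +
                s"byte-rate-throttle-time=${t.byteRateMs}ms " +
                s"io-thread-unit-throttle-time=${t.ioThreadUnitMs}ms")
            }
        }, periodSecs, periodSecs, TimeUnit.SECONDS)
      }
    }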
> > > >> > >> > > > >
> > > >> > >> > > > > Thanks,
> > > >> > >> > > > > Dong
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > >
> > > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
> > > >> > >> wangguoz@gmail.com>
> > > >> > >> > > > wrote:
> > > >> > >> > > > >
> > > >> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
> > > >> > >> > > > > > comment on the throttling implementation:
> > > >> > >> > > > > >
> > > >> > >> > > > > > Stated as "Request processing time throttling will be
> > > >> > >> > > > > > applied on top if necessary." I thought that it meant
> > > >> > >> > > > > > the request processing time throttling is applied
> > > >> > >> > > > > > first, but continuing to read I found it actually
> > > >> > >> > > > > > meant to apply produce / fetch byte rate throttling
> > > >> > >> > > > > > first.
> > > >> > >> > > > > >
> > > >> > >> > > > > > Also the last sentence "The remaining delay if any is
> > > >> > >> > > > > > applied to the response." is a bit confusing to me.
> > > >> > >> > > > > > Maybe reword it a bit?
> > > >> > >> > > > > >
> > > >> > >> > > > > >
> > > >> > >> > > > > > Guozhang
> > > >> > >> > > > > >
> > > >> > >> > > > > >
> > > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <
> > > jun@confluent.io
> > > >> >
> > > >> > >> wrote:
> > > >> > >> > > > > >
> > > >> > >> > > > > > > Hi, Rajini,
> > > >> > >> > > > > > >
> > > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks
> > > >> good
> > > >> > to
> > > >> > >> me.
> > > >> > >> > > > > > >
> > > >> > >> > > > > > > Jun
> > > >> > >> > > > > > >
> > > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> > > >> > >> > > > > rajinisivaram@gmail.com
> > > >> > >> > > > > > >
> > > >> > >> > > > > > > wrote:
> > > >> > >> > > > > > >
> > > >> > >> > > > > > > > Jun/Roger,
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > Thank you for the feedback.
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > 1. I have updated the KIP to use absolute units
> > > >> > >> > > > > > > > instead of percentage. The property is called
> > > >> > >> > > > > > > > *io_thread_units* to align with the thread count
> > > >> > >> > > > > > > > property *num.io.threads*. When we implement
> > > >> > >> > > > > > > > network thread utilization quotas, we can add
> > > >> > >> > > > > > > > another property *network_thread_units*.
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > 2. ControlledShutdown is already listed under the
> > > >> > >> > > > > > > > exempt requests. Jun, did you mean a different
> > > >> > >> > > > > > > > request that needs to be added? The four requests
> > > >> > >> > > > > > > > currently exempt in the KIP are StopReplica,
> > > >> > >> > > > > > > > ControlledShutdown, LeaderAndIsr and
> > > >> > >> > > > > > > > UpdateMetadata. These are controlled using the
> > > >> > >> > > > > > > > ClusterAction ACL, so it is easy to exclude them
> > > >> > >> > > > > > > > and throttle only if unauthorized. I wasn't sure
> > > >> > >> > > > > > > > if there are other requests used only for
> > > >> > >> > > > > > > > inter-broker communication that needed to be
> > > >> > >> > > > > > > > excluded.
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > 3. I was thinking the smallest change would be to
> > > >> > >> > > > > > > > replace all references to
> > > >> > >> > > > > > > > *requestChannel.sendResponse()* with a local
> > > >> > >> > > > > > > > method *sendResponseMaybeThrottle()* that does the
> > > >> > >> > > > > > > > throttling, if any, plus sends the response. If we
> > > >> > >> > > > > > > > throttle first in *KafkaApis.handle()*, the time
> > > >> > >> > > > > > > > spent within the method handling the request will
> > > >> > >> > > > > > > > not be recorded or used in throttling. We can look
> > > >> > >> > > > > > > > into this again when the PR is ready for review.
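
A minimal sketch of the shape that change could take (types and names here
are placeholders, not the actual KafkaApis code):

    // Every response goes through one local method that records the
    // request's handler time against the client's quota and applies any
    // delay, instead of each handler calling requestChannel.sendResponse()
    // directly.
    class ApisSketch(sendNow: String => Unit,
                     sendAfterDelay: (String, Long) => Unit, // delay queue
                     recordAndGetThrottleMs: (String, Long) => Long) {

      def sendResponseMaybeThrottle(clientId: String,
                                    requestTimeNanos: Long,
                                    response: String): Unit = {
        val throttleMs = recordAndGetThrottleMs(clientId, requestTimeNanos)
        if (throttleMs > 0) sendAfterDelay(response, throttleMs)
        else sendNow(response)
      }
    }

Because the recording happens at response-send time, the time spent inside
the handler method is included, which is the point of the change.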
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > Regards,
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > Rajini
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > > >> > >> > > > > roger.hoover@gmail.com>
> > > >> > >> > > > > > > > wrote:
> > > >> > >> > > > > > > >
> > > >> > >> > > > > > > > > Great to see this KIP and the excellent
> > discussion.
> > > >> > >> > > > > > > > >
> > > >> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
> > > >> application
> > > >> > is
> > > >> > >> > > > > allocated
> > > >> > >> > > > > > 1
> > > >> > >> > > > > > > > > request handler unit, then it's as if I have a
> > > Kafka
> > > >> > >> broker
> > > >> > >> > > with
> > > >> > >> > > > a
> > > >> > >> > > > > > > single
> > > >> > >> > > > > > > > > request handler thread dedicated to me.  That's
> > the
> > > >> > most I
> > > >> > >> > can
> > > >> > >> > > > use,
> > > >> > >> > > > > > at
> > > >> > >> > > > > > > > > least.  That allocation doesn't change even if an
> > > >> admin
> > > >> > >> later
> > > >> > >> > > > > > increases
> > > >> > >> > > > > > > > the
> > > >> > >> > > > > > > > > size of the request thread pool on the broker.
> > > It's
> > > >> > >> similar
> > > >> > >> > to
> > > >> > >> > > > the
> > > >> > >> > > > > > CPU
> > > >> > >> > > > > > > > > abstraction that VMs and containers get from
> > > >> hypervisors
> > > >> > >> or
> > > >> > >> > OS
> > > >> > >> > > > > > > > schedulers.
> > > >> > >> > > > > > > > > While different client access patterns can use
> > > >> > >> > > > > > > > > wildly different amounts of request thread
> > > >> > >> > > > > > > > > resources per request, a given application will
> > > >> > >> > > > > > > > > generally have a stable access pattern and can
> > > >> > >> > > > > > > > > figure out empirically how many "request thread
> > > >> > >> > > > > > > > > units" it needs to meet its throughput/latency
> > > >> > >> > > > > > > > > goals.
> > > >> > >> > > > > > > > >
> > > >> > >> > > > > > > > > Cheers,
> > > >> > >> > > > > > > > >
> > > >> > >> > > > > > > > > Roger
> > > >> > >> > > > > > > > >
> > > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
> > > >> > >> jun@confluent.io>
> > > >> > >> > > > wrote:
> > > >> > >> > > > > > > > >
> > > >> > >> > > > > > > > > > Hi, Rajini,
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > Thanks for the updated KIP. A few more
> > comments.
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > 1. A concern of request_time_percent is that
> > it's
> > > >> not
> > > >> > an
> > > >> > >> > > > absolute
> > > >> > >> > > > > > > > value.
> > > >> > >> > > > > > > > > > Let's say you give a user a 10% limit. If the
> > > admin
> > > >> > >> doubles
> > > >> > >> > > the
> > > >> > >> > > > > > > number
> > > >> > >> > > > > > > > of
> > > >> > >> > > > > > > > > > request handler threads, that user now actually
> > > has
> > > >> > >> twice
> > > >> > >> > the
> > > >> > >> > > > > > > absolute
> > > >> > >> > > > > > > > > > capacity. This may confuse people a bit. So,
> > > >> perhaps
> > > >> > >> > setting
> > > >> > >> > > > the
> > > >> > >> > > > > > > quota
> > > >> > >> > > > > > > > > > based on an absolute request thread unit is
> > > better.
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
> > > >> inter-broker
> > > >> > >> > request
> > > >> > >> > > > and
> > > >> > >> > > > > > > needs
> > > >> > >> > > > > > > > to
> > > >> > >> > > > > > > > > > be excluded from throttling.
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > 3. Implementation-wise, I am wondering if it's
> > > >> > >> > > > > > > > > > simpler to apply the request time throttling
> > > >> > >> > > > > > > > > > first in KafkaApis.handle(). Otherwise, we will
> > > >> > >> > > > > > > > > > need to add the throttling logic to each type
> > > >> > >> > > > > > > > > > of request.
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > Thanks,
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > Jun
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini
> > Sivaram <
> > > >> > >> > > > > > > > rajinisivaram@gmail.com
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > wrote:
> > > >> > >> > > > > > > > > >
> > > >> > >> > > > > > > > > > > Jun,
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > > Thank you for the review.
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > > I have reverted to the original KIP that
> > > >> throttles
> > > >> > >> based
> > > >> > >> > on
> > > >> > >> > > > > > request
> > > >> > >> > > > > > > > > > handler
> > > >> > >> > > > > > > > > > > utilization. At the moment, it uses
> > percentage,
> > > >> but
> > > >> > I
> > > >> > >> am
> > > >> > >> > > > happy
> > > >> > >> > > > > to
> > > >> > >> > > > > > > > > change
> > > >> > >> > > > > > > > > > to
> > > >> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if
> > > >> required. I
> > > >> > >> have
> > > >> > >> > > > added
> > > >> > >> > > > > > the
> > > >> > >> > > > > > > > > > examples
> > > >> > >> > > > > > > > > > > from this discussion to the KIP. Also added a
> > > >> > "Future
> > > >> > >> > Work"
> > > >> > >> > > > > > section
> > > >> > >> > > > > > > > to
> > > >> > >> > > > > > > > > > > address network thread utilization. The
> > > >> > configuration
> > > >> > >> is
> > > >> > >> > > > named
> > > >> > >> > > > > > > > > > > "request_time_percent" with the expectation
> > > that
> > > >> it
> > > >> > >> can
> > > >> > >> > > also
> > > >> > >> > > > be
> > > >> > >> > > > > > > used
> > > >> > >> > > > > > > > as
> > > >> > >> > > > > > > > > > the
> > > >> > >> > > > > > > > > > > limit for network thread utilization when
> > that
> > > is
> > > >> > >> > > > implemented,
> > > >> > >> > > > > so
> > > >> > >> > > > > > > > that
> > > >> > >> > > > > > > > > > > users have to set only one config for the two
> > > and
> > > >> > not
> > > >> > >> > have
> > > >> > >> > > to
> > > >> > >> > > > > > worry
> > > >> > >> > > > > > > > > about
> > > >> > >> > > > > > > > > > > the internal distribution of the work between
> > > the
> > > >> > two
> > > >> > >> > > thread
> > > >> > >> > > > > > pools
> > > >> > >> > > > > > > in
> > > >> > >> > > > > > > > > > > Kafka.
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > > Regards,
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > > Rajini
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> > > >> > >> > > jun@confluent.io>
> > > >> > >> > > > > > > wrote:
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > Hi, Rajini,
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > Thanks for the proposal.
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > The benefit of using the request processing
> > > >> time
> > > >> > >> over
> > > >> > >> > the
> > > >> > >> > > > > > request
> > > >> > >> > > > > > > > > rate
> > > >> > >> > > > > > > > > > is
> > > >> > >> > > > > > > > > > > > exactly what people have said. I will just
> > > >> expand
> > > >> > >> that
> > > >> > >> > a
> > > >> > >> > > > bit.
> > > >> > >> > > > > > > > > Consider
> > > >> > >> > > > > > > > > > > the
> > > >> > >> > > > > > > > > > > > following case. The producer sends a
> > produce
> > > >> > request
> > > >> > >> > > with a
> > > >> > >> > > > > > 10MB
> > > >> > >> > > > > > > > > > message
> > > >> > >> > > > > > > > > > > > but compressed to 100KB with gzip. The
> > > >> > >> decompression of
> > > >> > >> > > the
> > > >> > >> > > > > > > message
> > > >> > >> > > > > > > > > on
> > > >> > >> > > > > > > > > > > the
> > > >> > >> > > > > > > > > > > > broker could take 10-15 seconds, during
> > which
> > > >> > time,
> > > >> > >> a
> > > >> > >> > > > request
> > > >> > >> > > > > > > > handler
> > > >> > >> > > > > > > > > > > > thread is completely blocked. In this case,
> > > >> > neither
> > > >> > >> the
> > > >> > >> > > > > byte-in
> > > >> > >> > > > > > > > quota
> > > >> > >> > > > > > > > > > nor
> > > >> > >> > > > > > > > > > > > the request rate quota may be effective in
> > > >> > >> protecting
> > > >> > >> > the
> > > >> > >> > > > > > broker.
> > > >> > >> > > > > > > > > > > Consider
> > > >> > >> > > > > > > > > > > > another case. A consumer group starts with
> > 10
> > > >> > >> instances
> > > >> > >> > > and
> > > >> > >> > > > > > later
> > > >> > >> > > > > > > > on
> > > >> > >> > > > > > > > > > > > switches to 20 instances. The request rate
> > > >> > >> > > > > > > > > > > > will likely double, but the actual load on
> > > >> > >> > > > > > > > > > > > the broker may not double since each fetch
> > > >> > >> > > > > > > > > > > > request only contains half of the
> > > >> > >> > > > > > > > > > > > partitions. Request rate quota may not be
> > > >> > >> > > > > > > > > > > > easy to configure in this case.
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > What we really want is to be able to
> > prevent
> > > a
> > > >> > >> client
> > > >> > >> > > from
> > > >> > >> > > > > > using
> > > >> > >> > > > > > > > too
> > > >> > >> > > > > > > > > > much
> > > >> > >> > > > > > > > > > > > of the server side resources. In this
> > > >> particular
> > > >> > >> KIP,
> > > >> > >> > > this
> > > >> > >> > > > > > > resource
> > > >> > >> > > > > > > > > is
> > > >> > >> > > > > > > > > > > the
> > > >> > >> > > > > > > > > > > > capacity of the request handler threads. I
> > > >> agree
> > > >> > >> that
> > > >> > >> > it
> > > >> > >> > > > may
> > > >> > >> > > > > > not
> > > >> > >> > > > > > > be
> > > >> > >> > > > > > > > > > > > intuitive for the users to determine how to
> > > set
> > > >> > the
> > > >> > >> > right
> > > >> > >> > > > > > limit.
> > > >> > >> > > > > > > > > > However,
> > > >> > >> > > > > > > > > > > > this is not completely new and has been
> > done
> > > in
> > > >> > the
> > > >> > >> > > > container
> > > >> > >> > > > > > > world
> > > >> > >> > > > > > > > > > > > already. For example, Linux cgroup (
> > > >> > >> > > > > > > > > > > > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) has
> > has
> > > >> the
> > > >> > >> > concept
> > > >> > >> > > of
> > > >> > >> > > > > > > > > > > > cpu.cfs_quota_us,
> > > >> > >> > > > > > > > > > > > which specifies the total amount of time in
> > > >> > >> > microseconds
> > > >> > >> > > > for
> > > >> > >> > > > > > > which
> > > >> > >> > > > > > > > > all
> > > >> > >> > > > > > > > > > > > tasks in a cgroup can run during a one
> > second
> > > >> > >> period.
> > > >> > >> > We
> > > >> > >> > > > can
> > > >> > >> > > > > > > > > > potentially
> > > >> > >> > > > > > > > > > > > model the request handler threads in a
> > > similar
> > > >> > way.
> > > >> > >> For
> > > >> > >> > > > > > example,
> > > >> > >> > > > > > > > each
> > > >> > >> > > > > > > > > > > > request handler thread can be 1 request
> > > handler
> > > >> > unit
> > > >> > >> > and
> > > >> > >> > > > the
> > > >> > >> > > > > > > admin
> > > >> > >> > > > > > > > > can
> > > >> > >> > > > > > > > > > > > configure a limit on how many units (say
> > > 0.01)
> > > >> a
> > > >> > >> client
> > > >> > >> > > can
> > > >> > >> > > > > > have.
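
For concreteness (the numbers here are assumed for illustration): with
num.io.threads=8 the broker has 8 request handler units of capacity, so a
client capped at 0.01 units may use at most 10ms of request handler time
per second, i.e. 0.125% of the pool.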
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > Regarding not throttling the internal
> > broker
> > > to
> > > >> > >> broker
> > > >> > >> > > > > > requests.
> > > >> > >> > > > > > > We
> > > >> > >> > > > > > > > > > could
> > > >> > >> > > > > > > > > > > > do that. Alternatively, we could just let
> > the
> > > >> > admin
> > > >> > >> > > > > configure a
> > > >> > >> > > > > > > > high
> > > >> > >> > > > > > > > > > > limit
> > > >> > >> > > > > > > > > > > > for the kafka user (it may not be able to
> > do
> > > >> that
> > > >> > >> > easily
> > > >> > >> > > > > based
> > > >> > >> > > > > > on
> > > >> > >> > > > > > > > > > > clientId
> > > >> > >> > > > > > > > > > > > though).
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > Ideally we want to be able to protect the
> > > >> > >> utilization
> > > >> > >> > of
> > > >> > >> > > > the
> > > >> > >> > > > > > > > network
> > > >> > >> > > > > > > > > > > thread
> > > >> > >> > > > > > > > > > > > pool too. The difficult is mostly what
> > Rajini
> > > >> > said:
> > > >> > >> (1)
> > > >> > >> > > The
> > > >> > >> > > > > > > > mechanism
> > > >> > >> > > > > > > > > > for
> > > >> > >> > > > > > > > > > > > throttling the requests is through
> > Purgatory
> > > >> and
> > > >> > we
> > > >> > >> > will
> > > >> > >> > > > have
> > > >> > >> > > > > > to
> > > >> > >> > > > > > > > > think
> > > >> > >> > > > > > > > > > > > through how to integrate that into the
> > > network
> > > >> > >> layer.
> > > >> > >> > > (2)
> > > >> > >> > > > In
> > > >> > >> > > > > > the
> > > >> > >> > > > > > > > > > network
> > > >> > >> > > > > > > > > > > > layer, currently we know the user, but not
> > > the
> > > >> > >> clientId
> > > >> > >> > > of
> > > >> > >> > > > > the
> > > >> > >> > > > > > > > > request.
> > > >> > >> > > > > > > > > > > So,
> > > >> > >> > > > > > > > > > > > it's a bit tricky to throttle based on
> > > clientId
> > > >> > >> there.
> > > >> > >> > > > Plus,
> > > >> > >> > > > > > the
> > > >> > >> > > > > > > > > > byteOut
> > > >> > >> > > > > > > > > > > > quota can already protect the network
> > thread
> > > >> > >> > utilization
> > > >> > >> > > > for
> > > >> > >> > > > > > > fetch
> > > >> > >> > > > > > > > > > > > requests. So, if we can't figure out this
> > > part
> > > >> > right
> > > >> > >> > now,
> > > >> > >> > > > > just
> > > >> > >> > > > > > > > > focusing
> > > >> > >> > > > > > > > > > > on
> > > >> > >> > > > > > > > > > > > the request handling threads for this KIP
> > is
> > > >> > still a
> > > >> > >> > > useful
> > > >> > >> > > > > > > > feature.
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > Thanks,
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > Jun
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini
> > > >> Sivaram <
> > > >> > >> > > > > > > > > > rajinisivaram@gmail.com
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > wrote:
> > > >> > >> > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > Thank you all for the feedback.
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > Jay: I have removed exemption for
> > consumer
> > > >> > >> heartbeat
> > > >> > >> > > etc.
> > > >> > >> > > > > > Agree
> > > >> > >> > > > > > > > > that
> > > >> > >> > > > > > > > > > > > > protecting the cluster is more important
> > > than
> > > >> > >> > > protecting
> > > >> > >> > > > > > > > individual
> > > >> > >> > > > > > > > > > > apps.
> > > >> > >> > > > > > > > > > > > > Have retained the exemption for
> > > >> > >> > > StopReplica/LeaderAndIsr
> > > >> > >> > > > > > etc,
> > > >> > >> > > > > > > > > these
> > > >> > >> > > > > > > > > > > are
> > > >> > >> > > > > > > > > > > > > throttled only if authorization fails (so
> > > >> can't
> > > >> > be
> > > >> > >> > used
> > > >> > >> > > > for
> > > >> > >> > > > > > DoS
> > > >> > >> > > > > > > > > > attacks
> > > >> > >> > > > > > > > > > > > in
> > > >> > >> > > > > > > > > > > > > a secure cluster, but allows inter-broker
> > > >> > >> requests to
> > > >> > >> > > > > > complete
> > > >> > >> > > > > > > > > > without
> > > >> > >> > > > > > > > > > > > > delays).
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > I will wait another day to see if there is
> > is
> > > >> any
> > > >> > >> > > objection
> > > >> > >> > > > to
> > > >> > >> > > > > > > > quotas
> > > >> > >> > > > > > > > > > > based
> > > >> > >> > > > > > > > > > > > on
> > > >> > >> > > > > > > > > > > > > request processing time (as opposed to
> > > >> request
> > > >> > >> rate)
> > > >> > >> > > and
> > > >> > >> > > > if
> > > >> > >> > > > > > > there
> > > >> > >> > > > > > > > > are
> > > >> > >> > > > > > > > > > > no
> > > >> > >> > > > > > > > > > > > > objections, I will revert to the original
> > > >> > proposal
> > > >> > >> > with
> > > >> > >> > > > > some
> > > >> > >> > > > > > > > > changes.
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > The original proposal was only including
> > > the
> > > >> > time
> > > >> > >> > used
> > > >> > >> > > by
> > > >> > >> > > > > the
> > > >> > >> > > > > > > > > request
> > > >> > >> > > > > > > > > > > > > handler threads (that made calculation
> > > >> easy). I
> > > >> > >> think
> > > >> > >> > > the
> > > >> > >> > > > > > > > > suggestion
> > > >> > >> > > > > > > > > > is
> > > >> > >> > > > > > > > > > > > to
> > > >> > >> > > > > > > > > > > > > include the time spent in the network
> > > >> threads as
> > > >> > >> well
> > > >> > >> > > > since
> > > >> > >> > > > > > > that
> > > >> > >> > > > > > > > > may
> > > >> > >> > > > > > > > > > be
> > > >> > >> > > > > > > > > > > > > significant. As Jay pointed out, it is
> > more
> > > >> > >> > complicated
> > > >> > >> > > > to
> > > >> > >> > > > > > > > > calculate
> > > >> > >> > > > > > > > > > > the
> > > >> > >> > > > > > > > > > > > > total available CPU time and convert to a
> > > >> ratio
> > > >> > >> when
> > > >> > >> > > > there
> > > >> > >> > > > > > *m*
> > > >> > >> > > > > > > > I/O
> > > >> > >> > > > > > > > > > > > threads
> > > >> > >> > > > > > > > > > > > > and *n* network threads.
> > > >> > >> > ThreadMXBean#getThreadCPUTime(
> > > >> > >> > > )
> > > >> > >> > > > > may
> > > >> > >> > > > > > > > give
> > > >> > >> > > > > > > > > us
> > > >> > >> > > > > > > > > > > > what
> > > >> > >> > > > > > > > > > > > > we want, but it can be very expensive on
> > > some
> > > >> > >> > > platforms.
> > > >> > >> > > > As
> > > >> > >> > > > > > > > Becket
> > > >> > >> > > > > > > > > > and
> > > >> > >> > > > > > > > > > > > > Guozhang have pointed out, we do have
> > > several
> > > >> > time
> > > >> > >> > > > > > measurements
> > > >> > >> > > > > > > > > > already
> > > >> > >> > > > > > > > > > > > for
> > > >> > >> > > > > > > > > > > > > generating metrics that we could use,
> > > though
> > > >> we
> > > >> > >> might
> > > >> > >> > > > want
> > > >> > >> > > > > to
> > > >> > >> > > > > > > > > switch
> > > >> > >> > > > > > > > > > to
> > > >> > >> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis()
> > > >> since
> > > >> > >> some
> > > >> > >> > of
> > > >> > >> > > > the
> > > >> > >> > > > > > > > values
> > > >> > >> > > > > > > > > > for
> > > >> > >> > > > > > > > > > > > > small requests may be < 1ms. But rather
> > > than
> > > >> add
> > > >> > >> up
> > > >> > >> > the
> > > >> > >> > > > > time
> > > >> > >> > > > > > > > spent
> > > >> > >> > > > > > > > > in
> > > >> > >> > > > > > > > > > > I/O
> > > >> > >> > > > > > > > > > > > > thread and network thread, wouldn't it be
> > > >> better
> > > >> > >> to
> > > >> > >> > > > convert
> > > >> > >> > > > > > the
> > > >> > >> > > > > > > > > time
> > > >> > >> > > > > > > > > > > > spent
> > > >> > >> > > > > > > > > > > > > on each thread into a separate ratio?
> > UserA
> > > >> has
> > > >> > a
> > > >> > >> > > request
> > > >> > >> > > > > > quota
> > > >> > >> > > > > > > > of
> > > >> > >> > > > > > > > > > 5%.
> > > >> > >> > > > > > > > > > > > Can
> > > >> > >> > > > > > > > > > > > > we take that to mean that UserA can use
> > 5%
> > > of
> > > >> > the
> > > >> > >> > time
> > > >> > >> > > on
> > > >> > >> > > > > > > network
> > > >> > >> > > > > > > > > > > threads
> > > >> > >> > > > > > > > > > > > > and 5% of the time on I/O threads? If
> > > either
> > > >> is
> > > >> > >> > > exceeded,
> > > >> > >> > > > > the
> > > >> > >> > > > > > > > > > response
> > > >> > >> > > > > > > > > > > is
> > > >> > >> > > > > > > > > > > > > throttled - it would mean maintaining two
> > > >> sets
> > > >> > of
> > > >> > >> > > metrics
> > > >> > >> > > > > for
> > > >> > >> > > > > > > the
> > > >> > >> > > > > > > > > two
> > > >> > >> > > > > > > > > > > > > durations, but would result in more
> > > >> meaningful
> > > >> > >> > ratios.
> > > >> > >> > > We
> > > >> > >> > > > > > could
> > > >> > >> > > > > > > > > > define
> > > >> > >> > > > > > > > > > > > two
> > > >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request
> > > threads
> > > >> > and
> > > >> > >> 10%
> > > >> > >> > > of
> > > >> > >> > > > > > > network
> > > >> > >> > > > > > > > > > > > threads),
> > > >> > >> > > > > > > > > > > > > but that seems unnecessary and harder to
> > > >> explain
> > > >> > >> to
> > > >> > >> > > > users.
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > Back to why and how quotas are applied to
> > > >> > network
> > > >> > >> > > thread
> > > >> > >> > > > > > > > > utilization:
> > > >> > >> > > > > > > > > > > > > a) In the case of fetch,  the time spent
> > in
> > > >> the
> > > >> > >> > network
> > > >> > >> > > > > > thread
> > > >> > >> > > > > > > > may
> > > >> > >> > > > > > > > > be
> > > >> > >> > > > > > > > > > > > > significant and I can see the need to
> > > include
> > > >> > >> this.
> > > >> > >> > Are
> > > >> > >> > > > > there
> > > >> > >> > > > > > > > other
> > > >> > >> > > > > > > > > > > > > requests where the network thread
> > > >> utilization is
> > > >> > >> > > > > significant?
> > > >> > >> > > > > > > In
> > > >> > >> > > > > > > > > the
> > > >> > >> > > > > > > > > > > case
> > > >> > >> > > > > > > > > > > > > of fetch, request handler thread
> > > utilization
> > > >> > would
> > > >> > >> > > > throttle
> > > >> > >> > > > > > > > clients
> > > >> > >> > > > > > > > > > > with
> > > >> > >> > > > > > > > > > > > > high request rate, low data volume and
> > > fetch
> > > >> > byte
> > > >> > >> > rate
> > > >> > >> > > > > quota
> > > >> > >> > > > > > > will
> > > >> > >> > > > > > > > > > > > throttle
> > > >> > >> > > > > > > > > > > > > clients with high data volume. Network
> > > thread
> > > >> > >> > > utilization
> > > >> > >> > > > > is
> > > >> > >> > > > > > > > > perhaps
> > > >> > >> > > > > > > > > > > > > proportional to the data volume. I am
> > > >> wondering
> > > >> > >> if we
> > > >> > >> > > > even
> > > >> > >> > > > > > need
> > > >> > >> > > > > > > > to
> > > >> > >> > > > > > > > > > > > throttle
> > > >> > >> > > > > > > > > > > > > based on network thread utilization or
> > > >> whether
> > > >> > the
> > > >> > >> > data
> > > >> > >> > > > > > volume
> > > >> > >> > > > > > > > > quota
> > > >> > >> > > > > > > > > > > > covers
> > > >> > >> > > > > > > > > > > > > this case.
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > b) At the moment, we record and check for
> > > >> quota
> > > >> > >> > > violation
> > > >> > >> > > > > at
> > > >> > >> > > > > > > the
> > > >> > >> > > > > > > > > same
> > > >> > >> > > > > > > > > > > > time.
> > > >> > >> > > > > > > > > > > > > If a quota is violated, the response is
> > > >> delayed.
> > > >> > >> > Using
> > > >> > >> > > > > Jay'e
> > > >> > >> > > > > > > > > example
> > > >> > >> > > > > > > > > > of
> > > >> > >> > > > > > > > > > > > > disk reads for fetches happening in the
> > > >> network
> > > >> > >> > thread,
> > > >> > >> > > > We
> > > >> > >> > > > > > > can't
> > > >> > >> > > > > > > > > > record
> > > >> > >> > > > > > > > > > > > and
> > > >> > >> > > > > > > > > > > > > delay a response after the disk reads. We
> > > >> could
> > > >> > >> > record
> > > >> > >> > > > the
> > > >> > >> > > > > > time
> > > >> > >> > > > > > > > > spent
> > > >> > >> > > > > > > > > > > on
> > > >> > >> > > > > > > > > > > > > the network thread when the response is
> > > >> complete
> > > >> > >> and
> > > >> > >> > > > > > introduce
> > > >> > >> > > > > > > a
> > > >> > >> > > > > > > > > > delay
> > > >> > >> > > > > > > > > > > > for
> > > >> > >> > > > > > > > > > > > > handling a subsequent request (separate
> > out
> > > >> > >> recording
> > > >> > >> > > and
> > > >> > >> > > > > > quota
> > > >> > >> > > > > > > > > > > violation
> > > >> > >> > > > > > > > > > > > > handling in the case of network thread
> > > >> > overload).
> > > >> > >> > Does
> > > >> > >> > > > that
> > > >> > >> > > > > > > make
> > > >> > >> > > > > > > > > > sense?
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > Regards,
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > Rajini
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket
> > > Qin <
> > > >> > >> > > > > > > > becket.qin@gmail.com>
> > > >> > >> > > > > > > > > > > > wrote:
> > > >> > >> > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > Hey Jay,
> > > >> > >> > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU
> > time
> > > >> is a
> > > >> > >> > little
> > > >> > >> > > > > > > tricky. I
> > > >> > >> > > > > > > > > am
> > > >> > >> > > > > > > > > > > > > thinking
> > > >> > >> > > > > > > > > > > > > > that maybe we can use the existing
> > > request
> > > >> > >> > > statistics.
> > > >> > >> > > > > They
> > > >> > >> > > > > > > are
> > > >> > >> > > > > > > > > > > already
> > > >> > >> > > > > > > > > > > > > > very detailed so we can probably see
> > the
> > > >> > >> > approximate
> > > >> > >> > > > CPU
> > > >> > >> > > > > > time
> > > >> > >> > > > > > > > > from
> > > >> > >> > > > > > > > > > > it,
> > > >> > >> > > > > > > > > > > > > e.g.
> > > >> > >> > > > > > > > > > > > > > something like (total_time -
> > > >> > >> > > > request/response_queue_time
> > > >> > >> > > > > -
> > > >> > >> > > > > > > > > > > > remote_time).
> > > >> > >> > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user
> > is
> > > >> > >> throttled
> > > >> > >> > > it
> > > >> > >> > > > is
> > > >> > >> > > > > > > > likely
> > > >> > >> > > > > > > > > > that
> > > >> > >> > > > > > > > > > > > we
> > > >> > >> > > > > > > > > > > > > > need to see if anything has went wrong
> > > >> first,
> > > >> > >> and
> > > >> > >> > if
> > > >> > >> > > > the
> > > >> > >> > > > > > > users
> > > >> > >> > > > > > > > > are
> > > >> > >> > > > > > > > > > > well
> > > >> > >> > > > > > > > > > > > > > behaving and just need more resources,
> > we
> > > >> will
> > > >> > >> have
> > > >> > >> > > to
> > > >> > >> > > > > bump
> > > >> > >> > > > > > > up
> > > >> > >> > > > > > > > > the
> > > >> > >> > > > > > > > > > > > quota
> > > >> > >> > > > > > > > > > > > > > for them. It is true that
> > pre-allocating
> > > >> CPU
> > > >> > >> time
> > > >> > >> > > quota
> > > >> > >> > > > > > > > precisely
> > > >> > >> > > > > > > > > > for
> > > >> > >> > > > > > > > > > > > the
> > > >> > >> > > > > > > > > > > > > > users is difficult. So in practice it
> > > would
> > > >> > >> > probably
> > > >> > >> > > be
> > > >> > >> > > > > > more
> > > >> > >> > > > > > > > like
> > > >> > >> > > > > > > > > > > first
> > > >> > >> > > > > > > > > > > > > set
> > > >> > >> > > > > > > > > > > > > > a relative high protective CPU time
> > quota
> > > >> for
> > > >> > >> > > everyone
> > > >> > >> > > > > and
> > > >> > >> > > > > > > > > increase
> > > >> > >> > > > > > > > > > > > that
> > > >> > >> > > > > > > > > > > > > > for some individual clients on demand.
> > > >> > >> > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > Thanks,
> > > >> > >> > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > >> > >> > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM,
> > Guozhang
> > > >> > Wang <
> > > >> > >> > > > > > > > > wangguoz@gmail.com
> > > >> > >> > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > wrote:
> > > >> > >> > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see
> > > it
> > > >> > >> > happening.
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling,
> > or
> > > >> more
> > > >> > >> > > > > specifically
> > > >> > >> > > > > > > > > > > processing
> > > >> > >> > > > > > > > > > > > > time
> > > >> > >> > > > > > > > > > > > > > > ratio instead of the request rate
> > > >> throttling
> > > >> > >> as
> > > >> > >> > > well.
> > > >> > >> > > > > > > Becket
> > > >> > >> > > > > > > > > has
> > > >> > >> > > > > > > > > > > very
> > > >> > >> > > > > > > > > > > > > > well
> > > >> > >> > > > > > > > > > > > > > > summed my rationales above, and one
> > > >> thing to
> > > >> > >> add
> > > >> > >> > > here
> > > >> > >> > > > > is
> > > >> > >> > > > > > > that
> > > >> > >> > > > > > > > > the
> > > >> > >> > > > > > > > > > > > > former
> > > >> > >> > > > > > > > > > > > > > > has a good support for both
> > "protecting
> > > >> > >> against
> > > >> > >> > > rogue
> > > >> > >> > > > > > > > clients"
> > > >> > >> > > > > > > > > as
> > > >> > >> > > > > > > > > > > > well
> > > >> > >> > > > > > > > > > > > > as
> > > >> > >> > > > > > > > > > > > > > > "utilizing a cluster for
> > multi-tenancy
> > > >> > usage":
> > > >> > >> > when
> > > >> > >> > > > > > > thinking
> > > >> > >> > > > > > > > > > about
> > > >> > >> > > > > > > > > > > > how
> > > >> > >> > > > > > > > > > > > > to
> > > >> > >> > > > > > > > > > > > > > > explain this to the end users, I find
> > > it
> > > >> > >> actually
> > > >> > >> > > > more
> > > >> > >> > > > > > > > natural
> > > >> > >> > > > > > > > > > than
> > > >> > >> > > > > > > > > > > > the
> > > >> > >> > > > > > > > > > > > > > > request rate since as mentioned
> > above,
> > > >> > >> different
> > > >> > >> > > > > requests
> > > >> > >> > > > > > > > will
> > > >> > >> > > > > > > > > > have
> > > >> > >> > > > > > > > > > > > > quite
> > > >> > >> > > > > > > > > > > > > > > different "cost", and Kafka today
> > > already
> > > >> > have
> > > >> > >> > > > various
> > > >> > >> > > > > > > > request
> > > >> > >> > > > > > > > > > > types
> > > >> > >> > > > > > > > > > > > > > > (produce, fetch, admin, metadata,
> > etc),
> > > >> > >> because
> > > >> > >> > of
> > > >> > >> > > > that
> > > >> > >> > > > > > the
> > > >> > >> > > > > > > > > > request
> > > >> > >> > > > > > > > > > > > > rate
> > > >> > >> > > > > > > > > > > > > > > throttling may not be as effective
> > > >> unless it
> > > >> > >> is
> > > >> > >> > set
> > > >> > >> > > > > very
> > > >> > >> > > > > > > > > > > > > conservatively.
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > Regarding user reactions when they are
> > > are
> > > >> > >> > > throttled,
> > > >> > >> > > > I
> > > >> > >> > > > > > > think
> > > >> > >> > > > > > > > it
> > > >> > >> > > > > > > > > > may
> > > >> > >> > > > > > > > > > > > > > differ
> > > >> > >> > > > > > > > > > > > > > > case-by-case, and need to be
> > > discovered /
> > > >> > >> guided
> > > >> > >> > by
> > > >> > >> > > > > > looking
> > > >> > >> > > > > > > > at
> > > >> > >> > > > > > > > > > > > relative
> > > >> > >> > > > > > > > > > > > > > > metrics. So in other words users
> > would
> > > >> not
> > > >> > >> expect
> > > >> > >> > > to
> > > >> > >> > > > > get
> > > >> > >> > > > > > > > > > additional
> > > >> > >> > > > > > > > > > > > > > > information by simply being told
> > "hey,
> > > >> you
> > > >> > are
> > > >> > >> > > > > > throttled",
> > > >> > >> > > > > > > > > which
> > > >> > >> > > > > > > > > > is
> > > >> > >> > > > > > > > > > > > all
> > > >> > >> > > > > > > > > > > > > > > what throttling does; they need to
> > > take a
> > > >> > >> > follow-up
> > > >> > >> > > > > step
> > > >> > >> > > > > > > and
> > > >> > >> > > > > > > > > see
> > > >> > >> > > > > > > > > > > > "hmm,
> > > >> > >> > > > > > > > > > > > > > I'm
> > > >> > >> > > > > > > > > > > > > > > throttled probably because of ..",
> > > which
> > > >> is
> > > >> > by
> > > >> > >> > > > looking
> > > >> > >> > > > > at
> > > >> > >> > > > > > > > other
> > > >> > >> > > > > > > > > > > > metric
> > > >> > >> > > > > > > > > > > > > > > values: e.g. whether I'm bombarding
> > the
> > > >> > >> brokers
> > > >> > >> > > with
> > > >> > >> > > > > > > metadata
> > > >> > >> > > > > > > > > > > > request,
> > > >> > >> > > > > > > > > > > > > > > which are usually cheap to handle but
> > > I'm
> > > >> > >> sending
> > > >> > >> > > > > > thousands
> > > >> > >> > > > > > > > per
> > > >> > >> > > > > > > > > > > > second;
> > > >> > >> > > > > > > > > > > > > > or
> > > >> > >> > > > > > > > > > > > > > > is it because I'm catching up and
> > hence
> > > >> > >> sending
> > > >> > >> > > very
> > > >> > >> > > > > > heavy
> > > >> > >> > > > > > > > > > fetching
> > > >> > >> > > > > > > > > > > > > > request
> > > >> > >> > > > > > > > > > > > > > > with large min.bytes, etc.
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > Regarding the implementation, as once
> > > once
> > > >> > >> > discussed
> > > >> > >> > > > with
> > > >> > >> > > > > > > Jun,
> > > >> > >> > > > > > > > > this
> > > >> > >> > > > > > > > > > > > seems
> > > >> > >> > > > > > > > > > > > > > not
> > > >> > >> > > > > > > > > > > > > > > very difficult since today we are
> > > already
> > > >> > >> > > collecting
> > > >> > >> > > > > the
> > > >> > >> > > > > > > > > "thread
> > > >> > >> > > > > > > > > > > pool
> > > >> > >> > > > > > > > > > > > > > > utilization" metrics, which is a
> > single
> > > >> > >> > percentage
> > > >> > >> > > > > > > > > > > > "aggregateIdleMeter"
> > > >> > >> > > > > > > > > > > > > > > value; but we are already effectively
> > > >> > >> aggregating
> > > >> > >> > > it
> > > >> > >> > > > > for
> > > >> > >> > > > > > > each
> > > >> > >> > > > > > > > > > > > requests
> > > >> > >> > > > > > > > > > > > > in
> > > >> > >> > > > > > > > > > > > > > > KafkaRequestHandler, and we can just
> > > >> extend
> > > >> > >> it by
> > > >> > >> > > > > > recording
> > > >> > >> > > > > > > > the
> > > >> > >> > > > > > > > > > > > source
> > > >> > >> > > > > > > > > > > > > > > client id when handling them and
> > > >> aggregating
> > > >> > >> by
> > > >> > >> > > > > clientId
> > > >> > >> > > > > > as
> > > >> > >> > > > > > > > > well
> > > >> > >> > > > > > > > > > as
> > > >> > >> > > > > > > > > > > > the
> > > >> > >> > > > > > > > > > > > > > > total aggregate.
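
A small sketch of that extension (names are illustrative, not the actual
KafkaRequestHandler code):

    import java.util.concurrent.ConcurrentHashMap
    import java.util.concurrent.atomic.LongAdder

    object HandlerTimeByClient {
      private val nanosByClient = new ConcurrentHashMap[String, LongAdder]()
      private val totalNanos = new LongAdder // the existing aggregate

      // Attribute one handled request's time to its source client id as
      // well as to the total aggregate.
      def record(clientId: String, elapsedNanos: Long): Unit = {
        totalNanos.add(elapsedNanos)
        nanosByClient.computeIfAbsent(clientId, _ => new LongAdder)
          .add(elapsedNanos)
      }

      // Wrap the handling of one request to time it for its client.
      def timed[T](clientId: String)(handle: => T): T = {
        val start = System.nanoTime
        try handle
        finally record(clientId, System.nanoTime - start)
      }
    }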
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > Guozhang
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay
> > > >> Kreps <
> > > >> > >> > > > > > > jay@confluent.io
> > > >> > >> > > > > > > > >
> > > >> > >> > > > > > > > > > > wrote:
> > > >> > >> > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
> > > >> > >> > > > > > > > > > > > > > > >
> > > >> > >> > > > > > > > > > > > > > > > When I thought about it more
> > deeply I
> > > >> came
> > > >> > >> > around
> > > >> > >> > > > to
> > > >> > >> > > > > > the
> > > >> > >> > > > > > > > > > "percent
> > > >> > >> > > > > > > > > > > > of
> > > >> > >> > > > > > > > > > > > > > > > processing time" metric too. It
> > > seems a
> > > >> > lot
> > > >> > >> > > closer
> > > >> > >> > > > to
> > > >> > >> > > > > > the
> > > >> > >> > > > > > > > > thing
> > > >> > >> > > > > > > > > > > we
> actually care about and need to protect. I also think this would be a very
> useful metric even in the absence of throttling just to debug who's using
> capacity.
>
> Two problems to consider:
>
>    1. I agree that for the user it is understandable what led to their
>    being throttled, but it is a bit hard to figure out the safe range for
>    them. i.e. if I have a new app that will send 200 messages/sec I can
>    probably reason that I'll be under the throttling limit of 300 req/sec.
>    However if I need to be under a 10% CPU resources limit it may be a bit
>    harder for me to know a priori if I will or won't.
>    2. Calculating the available CPU time is a bit difficult since there are
>    actually two thread pools--the I/O threads and the network threads. I
>    think it might be workable to count just the I/O thread time as in the
>    proposal, but the network thread work is actually non-trivial (e.g. all
>    the disk reads for fetches happen in that thread). If you count both the
>    network and I/O threads it can skew things a bit. E.g. say you have 50
>    network threads, 10 I/O threads, and 8 cores, what is the available cpu
>    time in a second? I suppose this is a problem whenever you have a
>    bottleneck between I/O and network threads or if you end up
>    significantly over-provisioning one pool (both of which are hard to
>    avoid).
>
> An alternative for CPU throttling would be to use this api:
> http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
>
> That would let you track actual CPU usage across the network, I/O, and
> purgatory threads and look at it as a percentage of total cores. I think
> this fixes many problems in the reliability of the metric. Its meaning is
> slightly different as it is just CPU (you don't get charged for time
> blocking on I/O) but that may be okay because we already have a throttle
> on I/O. The downside is I think it is possible this api can be disabled or
> isn't always available and it may also be expensive (also I've never used
> it so not sure if it really works the way I think).
>
> -Jay
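For illustration, a rough sketch of the sampling Jay describes, using the
standard ThreadMXBean calls; the class and method names here are hypothetical
and not part of the KIP:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    // Sample per-thread CPU time and express usage as a share of all cores.
    public class CpuShareSampler {
        private final ThreadMXBean bean = ManagementFactory.getThreadMXBean();

        // Total CPU nanos consumed by the given threads, or -1 if the JVM
        // does not support (or has disabled) thread CPU time measurement.
        public long totalCpuNanos(long[] threadIds) {
            if (!bean.isThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled())
                return -1;
            long total = 0;
            for (long id : threadIds) {
                long nanos = bean.getThreadCpuTime(id); // -1 if the thread has died
                if (nanos > 0)
                    total += nanos;
            }
            return total;
        }

        // CPU used between two samples, as a percentage of total cores.
        public double percentOfCores(long deltaCpuNanos, long intervalNanos) {
            int cores = Runtime.getRuntime().availableProcessors();
            return 100.0 * deltaCpuNanos / ((double) intervalNanos * cores);
        }
    }

As Jay notes, thread CPU time measurement may be unsupported or disabled on
some JVMs, so any real implementation would need the fallback path above.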
>
> On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com> wrote:
>
> > If the purpose of the KIP is only to protect the cluster from being
> > overwhelmed by crazy clients and is not intended to address the resource
> > allocation problem among the clients, I am wondering if using a request
> > handling time quota (CPU time quota) is a better option. Here are the
> > reasons:
> >
> > 1. A request handling time quota has better protection. Say we have a
> > request rate quota and set that to some value like 100 requests/sec; it
> > is possible that some of the requests are very expensive and actually
> > take a lot of time to handle. In that case a few clients may still occupy
> > a lot of CPU time even though the request rate is low. Arguably we can
> > carefully set a request rate quota for each request and client id
> > combination, but it could still be tricky to get it right for everyone.
> >
> > If we use the request handling time quota, we can simply say that no
> > client can take more than 30% of the total request handling capacity
> > (measured by time), regardless of the difference among different requests
> > or what the client is doing. In this case maybe we can quota all the
> > requests if we want to.
> >
> > 2. The main benefit of using a request rate limit is that it seems more
> > intuitive. It is true that it is probably easier to explain to the user
> > what it means. However, in practice the impact of a request rate quota is
> > not more quantifiable than that of a request handling time quota. Unlike
> > the byte rate quota, it is still difficult to give a number for the
> > impact on throughput or latency when a request rate quota is hit. So it
> > is not better than the request handling time quota. In fact I feel it is
> > clearer to tell a user "you are limited because you have taken 30% of the
> > CPU time on the broker" than something like "your request rate quota on
> > metadata requests has been reached".
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
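A minimal sketch of the accounting Becket describes, assuming per-client
handling time is recorded as requests complete; the names are illustrative
and this is not the KIP's implementation:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicLong;

    // Track request handling time per client over one quota window and flag
    // clients whose share of total handling capacity exceeds their quota.
    public class HandlingTimeQuota {
        private final Map<String, AtomicLong> usedNanos = new ConcurrentHashMap<>();
        private final long windowNanos;
        private final int numIoThreads;

        public HandlingTimeQuota(long windowNanos, int numIoThreads) {
            this.windowNanos = windowNanos;
            this.numIoThreads = numIoThreads;
        }

        // Record the time one request took to handle, attributed to a client.
        public void record(String clientId, long handlingNanos) {
            usedNanos.computeIfAbsent(clientId, k -> new AtomicLong())
                     .addAndGet(handlingNanos);
        }

        // True if the client used more than quotaPercent of total capacity,
        // where capacity is the window length times the number of I/O threads.
        public boolean overQuota(String clientId, double quotaPercent) {
            long capacityNanos = windowNanos * numIoThreads;
            AtomicLong used = usedNanos.get(clientId);
            return used != null && 100.0 * used.get() / capacityNanos > quotaPercent;
        }
    }

Measured this way, "no client may take more than 30% of the request handling
capacity" holds whether a client sends a few expensive requests or many cheap
ones.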
> >
> >
> > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io> wrote:
> >
> > > I think this proposal makes a lot of sense (especially now that it is
> > > oriented around request rate) and fills the biggest remaining gap in
> > > the multi-tenancy story.
> > >
> > > I think for intra-cluster communication (StopReplica, etc) we could
> > > avoid throttling entirely. You can secure or otherwise lock down the
> > > cluster communication to avoid any unauthorized external party from
> > > trying to initiate these requests. As a result we are as likely to
> > > cause problems as solve them by throttling these, right?
> > >
> > > I'm not so sure that we should exempt the consumer requests such as
> > > heartbeat. It's true that if we throttle an app's heartbeat requests
> > > it may cause it to fall out of its consumer group. However if we don't
> > > throttle it, it may DDOS the cluster if the heartbeat interval is set
> > > incorrectly or if some client in some language has a bug. I think the
> > > policy with this kind of throttling is to protect the cluster above
> > > any individual app, right? I think in general this should be okay
> > > since for most deployments this setting is meant as more of a safety
> > > valve---that is, rather than setting something very close to what you
> > > expect to need (say 2 req/sec or whatever), you would have something
> > > quite high (like 100 req/sec), with this meant to prevent a client
> > > gone crazy. I think when used this way allowing those to be throttled
> > > would actually provide meaningful protection.
> > >
> > > -Jay

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Ismael Juma <is...@juma.me.uk>.
Thanks for the updates, Rajini, they look good to me.

Ismael

On Thu, Mar 16, 2017 at 3:55 PM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Thank you, Jun.
>
> Many thanks to everyone for the feedback and suggestions so far. If there
> are any other suggestions or concerns, please do raise them on this thread.
> Otherwise, I will start the vote early next week.
>
> Regards,
>
> Rajini
>
>
> On Thu, Mar 16, 2017 at 11:48 AM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Rajini,
> >
> > Thanks for the updated KIP. It looks good to me now. Perhaps we can wait
> > for a couple more days to see if there are more comments and then start
> > the vote?
> >
> > Jun
> >
> > On Thu, Mar 16, 2017 at 6:35 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> >
> > > Jun,
> > >
> > > 50. Yes, that makes sense. I have updated the KIP.
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> > > On Mon, Mar 13, 2017 at 7:35 PM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > Thanks for the updated KIP. Looks good. Just one more thing.
> > > >
> > > > 50. "Two new metrics request-throttle-time-max and
> > > > request-throttle-time-min will be added to reflect total request
> > > > processing time based throttling for all request types including
> > > > produce/fetch." The most important clients are producer and consumer,
> > > > which already have the produce/fetch-throttle-time-min/max metrics.
> > > > Should we just accumulate the throttled time for other requests into
> > > > these two existing metrics, instead of introducing new ones? We can
> > > > probably add a similar metric for the admin client later on.
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Thu, Mar 9, 2017 at 2:24 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > 40. Yes you are right, a single value tracking the total exempt time
> > > > > is sufficient. Have updated the KIP.
> > > > >
> > > > > Thank you,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Thu, Mar 9, 2017 at 9:42 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > The updated KIP looks good. Just one more comment.
> > > > > >
> > > > > > 40. "An additional metric exempt-request-time will also be added
> > > > > > for each quota entity for the quota type Request." Should that
> > > > > > metric be added for each entity type (e.g., user, client-id, etc)?
> > > > > > It seems that value is independent of entity types.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Thu, Mar 9, 2017 at 12:07 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > >
> > > > > > > Hi Jun,
> > > > > > >
> > > > > > > Thank you for reviewing the KIP again.
> > > > > > >
> > > > > > > 30. That is a good idea. In fact, it is one of the advantages of
> > > > > > > measuring overall utilization rather than separate values for
> > > > > > > network and I/O threads as I had intended initially. Have
> > > > > > > updated the KIP, thanks.
> > > > > > >
> > > > > > > 31. Added exempt-request-time metric.
> > > > > > >
> > > > > > > 32. I had thought of using quota.window.size.seconds *
> > > > > > > quota.window.num initially, but felt that would be too big. Even
> > > > > > > the default of 11 seconds is a rather long time to be throttled.
> > > > > > > With a limit of quota.window.size.seconds, subsequent requests
> > > > > > > for that total interval of the samples will also each be
> > > > > > > throttled for quota.window.size.seconds if the time recorded was
> > > > > > > very high. So limiting at quota.window.size.seconds limits the
> > > > > > > throttle time for an individual request, avoiding timeouts where
> > > > > > > possible, but still throttles over a period of time.
> > > > > > >
> > > > > > > 33. Updated to use request_percentage.
> > > > > > >
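The capping described here reduces to a one-line rule; a hypothetical sketch,
assuming the computed delay and the window size are both available in
milliseconds:

    // Delay a throttled request by its computed over-quota time, but never
    // longer than one quota window, so an individual request is delayed at
    // most quota.window.size.seconds while repeated requests keep being
    // throttled across the full set of samples.
    static long cappedThrottleMs(long computedDelayMs, long quotaWindowSizeMs) {
        return Math.min(computedDelayMs, quotaWindowSizeMs);
    }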
> > > > > > >
> > > > > > > On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > > > >
> > > > > > > > Hi, Rajini,
> > > > > > > >
> > > > > > > > Thanks for the updated KIP. A few more comments.
> > > > > > > >
> > > > > > > > 30. Should we just account for the time in network threads in
> > > > > > > > this KIP too? The issue with doing this later is that existing
> > > > > > > > quotas may be too small and everyone will have to adjust them
> > > > > > > > before upgrading, which is inconvenient. If we just do the
> > > > > > > > delaying in the io threads, there probably isn't too much
> > > > > > > > additional work to include the network thread time?
> > > > > > > >
> > > > > > > > 31. It would be useful for the new metrics to capture the
> > > > > > > > utilization of all those requests exempt from request
> > > > > > > > throttling (under something like "exempt"). It's useful for an
> > > > > > > > admin to know how much time is spent there too.
> > > > > > > >
> > > > > > > > 32. "The maximum throttle time for any single request will be
> > > > > > > > the quota window size (one second by default)." We probably
> > > > > > > > should cap the delay at quota.window.size.seconds *
> > > > > > > > quota.window.num?
> > > > > > > >
> > > > > > > > 33. It's unfortunate that we use . in configs and _ in ZK data
> > > > > > > > structures. However, for consistency, request.percentage in ZK
> > > > > > > > probably should be request_percentage?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > I have updated the KIP to use "request.percentage" quotas
> > > > > > > > > where the percentage is out of a total of (num.io.threads *
> > > > > > > > > 100). I have added the other options considered so far under
> > > > > > > > > "Rejected Alternatives".
> > > > > > > > >
> > > > > > > > > To address Todd's concern about per-thread quotas: Even
> > > > > > > > > though the quotas are out of (num.io.threads * 100), clients
> > > > > > > > > are not locked into threads. Utilization is measured as the
> > > > > > > > > total across all the I/O threads, and a 10% quota can be 1%
> > > > > > > > > of 10 threads. Individual quotas can also be greater than
> > > > > > > > > 100% if required.
> > > > > > > > >
> > > > > > > > > Please let me know if there are any other concerns or
> > > > > > > > > suggestions.
> > > > > > > > >
> > > > > > > > > Thank you,
> > > > > > > > >
> > > > > > > > > Rajini
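To make the arithmetic concrete (an illustration, not text from the KIP):
with num.io.threads = 10 the total capacity is 1000, and a quota of 10 buys
10% of one thread's time, which may equally be spent as 1% on each of the
ten threads:

    // Hypothetical check: utilization is summed across all I/O threads, so
    // a client is never pinned to any particular thread.
    static boolean withinQuota(double[] perThreadPercent, double quotaPercent) {
        double total = 0;
        for (double p : perThreadPercent)
            total += p;               // e.g. ten threads at 1% each => 10.0
        return total <= quotaPercent; // the quota may also exceed 100
    }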
> > > > > > > > >
> > > > > > > > > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tpalino@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Rajini -
> > > > > > > > > >
> > > > > > > > > > I understand what you're saying, but the point I'm making
> > > > > > > > > > is that I don't believe we need to take it into account
> > > > > > > > > > directly. The CPU utilization of the network threads is
> > > > > > > > > > directly proportional to the number of bytes being sent.
> > > > > > > > > > The more bytes, the more CPU that is required for SSL (or
> > > > > > > > > > other tasks). This is opposed to the request handler
> > > > > > > > > > threads, where there are a number of factors that affect
> > > > > > > > > > CPU utilization. This means that it's not necessary to
> > > > > > > > > > separately quota network thread byte usage and CPU - if we
> > > > > > > > > > quota byte usage (which we already do), we have fixed the
> > > > > > > > > > CPU usage at a proportional amount.
> > > > > > > > > >
> > > > > > > > > > Jun -
> > > > > > > > > >
> > > > > > > > > > Thanks for the clarification there. I was thinking of the
> > > > > > > > > > utilization percentage as being fixed, not what the
> > > > > > > > > > percentage reflects. I'm not tied to either way of doing
> > > > > > > > > > it, provided that we do not lock clients to a single
> > > > > > > > > > thread. For example, if I specify that a given client can
> > > > > > > > > > use 10% of a single thread, that should also mean they can
> > > > > > > > > > use 1% on 10 threads.
> > > > > > > > > >
> > > > > > > > > > -Todd
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Todd,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the feedback.
> > > > > > > > > > >
> > > > > > > > > > > I just want to clarify your second point. If the limit
> > > > > > > > > > > percentage is per thread and the thread counts are
> > > > > > > > > > > changed, the absolute processing limit for existing
> > > > > > > > > > > users hasn't changed and there is no need to adjust
> > > > > > > > > > > them. On the other hand, if the limit percentage is of
> > > > > > > > > > > total thread pool capacity and the thread counts are
> > > > > > > > > > > changed, the effective processing limit for a user will
> > > > > > > > > > > change. So, to preserve the current processing limit,
> > > > > > > > > > > existing user limits have to be adjusted. If there is a
> > > > > > > > > > > hardware change, the effective processing limit for a
> > > > > > > > > > > user will change in either approach and the existing
> > > > > > > > > > > limit may need to be adjusted. However, hardware changes
> > > > > > > > > > > are less common than thread pool configuration changes.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tpalino@gmail.com> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I've been following this one on and off, and overall
> > > > > > > > > > > > it sounds good to me.
> > > > > > > > > > > >
> > > > > > > > > > > > - The SSL question is a good one. However, that type
> > > > > > > > > > > > of overhead should be proportional to the bytes rate,
> > > > > > > > > > > > so I think that a bytes rate quota would still be a
> > > > > > > > > > > > suitable way to address it.
> > > > > > > > > > > >
> > > > > > > > > > > > - I think it's better to make the quota percentage of
> > > > > > > > > > > > total thread pool capacity, and not percentage of an
> > > > > > > > > > > > individual thread. That way you don't have to adjust
> > > > > > > > > > > > it when you adjust thread counts (tuning, hardware
> > > > > > > > > > > > changes, etc.)
> > > > > > > > > > > >
> > > > > > > > > > > > -Todd
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <becket.qin@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > I see. Good point about SSL.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I just asked Todd to take a look.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, Jiangjie,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yes, I agree that byte rate already protects the
> > > > > > > > > > > > > > network threads indirectly. I am not sure if byte
> > > > > > > > > > > > > > rate fully captures the CPU overhead in network
> > > > > > > > > > > > > > due to SSL. So, at the high level, we can use
> > > > > > > > > > > > > > request time limit to protect CPU and use byte
> > > > > > > > > > > > > > rate to protect storage and network.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Also, do you think you can get Todd to comment on
> > > > > > > > > > > > > > this KIP?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <becket.qin@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi Rajini/Jun,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The percentage based reasoning sounds good.
> > > > > > > > > > > > > > > One thing I am wondering is that if we assume
> > > > > > > > > > > > > > > the network threads are just doing the network
> > > > > > > > > > > > > > > IO, can we say the bytes rate quota is already
> > > > > > > > > > > > > > > sort of a network threads quota?
> > > > > > > > > > > > > > > If we take network threads into the
> > > > > > > > > > > > > > > consideration here, would that be somewhat
> > > > > > > > > > > > > > > overlapping with the bytes rate quota?
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you for the explanation, I hadn't realized you meant percentage of the total thread pool. If everyone is OK with Jun's suggestion, I will update the KIP.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Let's take your example. Let's say a user sets the limit to 50%. I am not sure if it's better to apply the same percentage separately to the network and io thread pool. For example, for produce requests, most of the time will be spent in the io threads whereas for fetch requests, most of the time will be in the network threads. So, using the same percentage in both thread pools means one of the pools' resource will be over allocated.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > An alternative way is to simply model the network and io thread pool together. If you get 10 io threads and 5 network threads, you get 1500% request processing power. A 50% limit means a total of 750% processing power. We just add up the time a user request spent in either network or io thread. If that total exceeds 750% (doesn't matter whether it's spent more in network or io thread), the request will be throttled. This seems more general and is not sensitive to the current implementation detail of having a separate network and io thread pool. In the future, if the threading model changes, the same concept of quota can still be applied. For now, since it's a bit tricky to add the delay logic in the network thread pool, we could probably just do the delaying only in the io threads as you suggested earlier.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > There is still the orthogonal question of whether a quota of 50% is out of 100% or 100% * #total processing threads. My feeling is that the latter is slightly better based on my explanation earlier. The way to describe this quota to the users can be "share of elapsed request processing time on a single CPU" (similar to top).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > >
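A sketch of the combined model Jun outlines, with illustrative names only:
capacity is (network threads + io threads) * 100%, and a user's usage sums
the time spent in either pool over the quota window:

    // Model both pools as one capacity of n network + m I/O threads.
    public class CombinedPoolQuota {
        private final int networkThreads;
        private final int ioThreads;

        public CombinedPoolQuota(int networkThreads, int ioThreads) {
            this.networkThreads = networkThreads;
            this.ioThreads = ioThreads;
        }

        // Capacity in "percent of one thread" units, as top reports CPU:
        // 5 network + 10 I/O threads => 1500.
        public double capacityPercent() {
            return (networkThreads + ioThreads) * 100.0;
        }

        // A user's usage over the window, in the same units.
        public double usagePercent(long networkNanos, long ioNanos, long windowNanos) {
            return 100.0 * (networkNanos + ioNanos) / windowNanos;
        }

        // Throttle once usage exceeds the user's share of total capacity,
        // e.g. a 50% limit (limitFraction = 0.5) on 15 threads => 750.
        public boolean shouldThrottle(double usagePercent, double limitFraction) {
            return usagePercent > limitFraction * capacityPercent();
        }
    }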
> > > > > > > > > > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > But still not sure about a single quota covering both network threads and I/O threads with per-thread quota. If there are 10 I/O threads and 5 network threads and I want to assign half the quota to userA, the quota would be 750%. I imagine, internally, we would convert this to 500% for I/O and 250% for network threads to allocate 50% of each pool.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > A couple of scenarios:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 1. Admin adds 1 extra network thread. To retain 50%, admin needs to now allocate 800% for each user. Or increase the quota for a few users. To me, it feels like admin needs to convert 50% to 800% and Kafka internally needs to convert 800% to (500%, 300%). Everyone using just 50% feels a lot simpler.
> > > > > > > > > > > > > > > > > > 2. We decide to add some other thread to this list. Admin needs to know exactly how many threads form the maximum quota. And we can be changing this between broker versions as we add more to the list. Again a single overall percent would be a lot simpler.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > There were others who were unconvinced by a single percent from the initial proposal and were happier with thread units similar to CPU units, so I am ok with going with per-thread quotas (as units or percent). Just not sure it makes it easier for admin in all cases.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Consider modeling as n * 100% unit. For 2), the question is what's causing the I/O threads to be saturated. It's unlikely that all users' utilization have increased at the same time. A more likely case is that a few isolated users' utilization have increased. If so, after increasing the number of threads, the admin just needs to adjust the quota for a few isolated users, which is expected and is less work.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all users' quota need to be adjusted, which is unexpected and is more work.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > As for future extension to cover network thread utilization, I was thinking that one way is to simply model the capacity as (n + m) * 100% unit, where n and m are the number of network and i/o threads, respectively. Then, for each user, we can just add up the utilization in the network and the i/o thread. If we do this, we don't need a new type of quota.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > If we use request.percentage as the percentage used in a single I/O thread, the total percentage being allocated will be num.io.threads * 100 for I/O threads and num.network.threads * 100 for network threads. A single quota covering the two as a percentage wouldn't quite work if you want to allocate the same proportion in both cases. If we want to treat threads as separate units, won't we need two quota configurations regardless of whether we use units or percentage? Perhaps I misunderstood your suggestion.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >    1. The use case that you mentioned where an admin is adding more users and decides to add more I/O threads and expects to find free quota to allocate for new users.
> > > > > > > > > > > > > > > > > > > >    2. Admin adds more I/O threads because the I/O threads are saturated and there are cores available to allocate, even though the number of users/clients hasn't changed.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > If we treated I/O threads as a single unit of 100%, all user quotas need to be reallocated for 1). If we allocated I/O threads as n units with n*100%, all user quotas need to be reallocated for 2), otherwise some of the new threads may just not be used. Either way it should be easy to write a script to decrease/increase quotas by a multiple for all users.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > So it really boils down to which quota unit is most intuitive in terms of configuration. And from the discussion so far, it feels like opinion is divided on whether quotas should be carved out of an absolute 100% (or 1 unit) or be relative to the number of threads (n*100% or n units).
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Another way to express an absolute limit is to use request.percentage, but treat it as the percentage used in a single request handling thread. For now, the request handling threads can be just the io threads. In the future, they can cover the network threads as well. This is similar to how top reports CPU usage and may be a bit easier for people to understand.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > > > >
On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Jay,

2. Regarding request.unit vs request.percentage. I started with
request.percentage too. The reasoning for request.unit is the following.
Suppose that the capacity has been reached on a broker and the admin needs
to add a new user. A simple way to increase the capacity is to increase
the number of io threads, assuming there are still enough cores. If the
limit is based on percentage, the additional capacity automatically gets
distributed to existing users and we haven't really carved out any
additional resource for the new user. Now, is it easy for a user to reason
about 0.1 unit vs 10%? My feeling is that both are hard and have to be
configured empirically. Not sure if percentage is obviously easier to
reason about.

Thanks,

Jun

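To make the two conventions concrete, a small illustrative calculation
(the thread counts and quota values here are made up, not from the KIP):

    // Illustrative only: one user's effective capacity under the two quota
    // conventions being debated, before and after the broker's io thread
    // count is doubled from 8 to 16.
    public class QuotaConventionSketch {
        public static void main(String[] args) {
            int threadsBefore = 8, threadsAfter = 16;
            double percentQuota = 0.10; // 10% of total request handling time
            double unitQuota = 0.8;     // 0.8 "thread units" (absolute)

            // Percentage convention: capacity grows with the thread count,
            // so new capacity is automatically shared by existing users.
            System.out.printf("percent: %.1f -> %.1f threads%n",
                    percentQuota * threadsBefore, percentQuota * threadsAfter);
            // Unit convention: capacity stays fixed, so the extra threads
            // can be carved out for a new user.
            System.out.printf("units:   %.1f -> %.1f threads%n",
                    unitQuota, unitQuota);
        }
    }

This prints 0.8 -> 1.6 threads for the percentage quota but 0.8 -> 0.8 for
the unit quota, which is the redistribution effect described above.
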
On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:

A couple of quick points:

1. Even though the implementation of this quota is only using io thread
time, I think we should call it something like "request-time". This will
give us flexibility to improve the implementation to cover network threads
in the future and will avoid exposing internal details like our thread
pools on the server.

2. Jun/Roger, I get what you are trying to fix but the idea of
thread/units is super unintuitive as a user-facing knob. I had to read the
KIP like eight times to understand this. I'm not sure about your point
that increasing the number of threads is a problem with a percentage-based
value; it really depends on whether the user thinks about the "percentage
of request processing time" or "thread units". If they think "I have
allocated 10% of my request processing time to user x" then it is a bug
that increasing the thread count decreases that percent as it does in the
current proposal. As a practical matter I think the only way to actually
reason about this is as a percent---I just don't believe people are going
to think, "ah, 4.3 thread units, that is the right amount!". Instead I
think they have to understand this thread unit concept, figure out what
they have set in number of threads, compute a percent and then come up
with the number of thread units, and these will all be wrong if that
thread count changes. I also think this ties us to throttling the I/O
thread pool, which may not be where we want to end up.

3. For what it's worth I do think having a single throttle_ms field in all
the responses that combines all throttling from all quotas is probably the
simplest. There could be a use case for having separate fields for each,
but I think that is actually harder to use/monitor in the common case so
unless someone has a use case I think just one should be fine.

-Jay

On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

I have updated the KIP based on the discussions so far.

Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like
LeaderAndIsr etc. The simplest way to ensure that clients cannot use these
requests to bypass quotas for DoS attacks is to ensure that ACLs prevent
clients from using these requests and unauthorized requests are included
towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a separate
throttle time, and all utilization based quotas could use the same field
(we won't add another one for network thread utilization, for instance).
But perhaps it makes sense to keep byte rate quotas separate in
produce/fetch responses to provide separate metrics? Agree with Ismael
that the name of the existing field should be changed if we have two.
Happy to switch to a single combined throttle time if that is sufficient.

Ismael #4, #5, #6: Will update KIP. Will use dot separated name for the
new property. Replication quotas use dot separated, so it will be
consistent with all properties except byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen
because the time per request can vary significantly between requests, as
mentioned in the discussion and KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more
configuration and more metrics. Since most users would set quotas higher
than the expected usage and quotas are more of a safety net, a single
quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of active
connections, since only one request per connection will be throttled at a
time.
#4 As with byte rate quotas, to use the full allocated quotas,
clients/users would need to use partitions that are distributed across the
cluster. The alternative of using cluster-wide quotas instead of
per-broker quotas would be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for quota types Fetch and
Produce. A new one will be added for IOThread, which manages quotas for
I/O thread utilization. This will not update the Fetch or Produce
queue-size, but will have a separate metric for the queue-size. I wasn't
planning to add any additional metrics apart from the equivalent ones for
existing quotas as part of this KIP. Ratio of byte-rate to I/O thread
utilization could be slightly misleading since it depends on the sequence
of requests. But we can look into more metrics after the KIP is
implemented, if required.

I think we need to limit the maximum delay since all requests are
throttled. If a client has a quota of 0.001 units and a single request
used 50ms, we don't want to delay all requests from the client by 50
seconds, throwing the client out of all its consumer groups. The issue is
only if a user is allocated a quota that is insufficient to process one
large request. The expectation is that the units allocated per user will
be much higher than the time taken to process one request and the limit
should seldom be applied. Agree this needs proper documentation.

Regards,

Rajini

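Using the numbers in that example, a minimal sketch of the capped delay
computation (variable names are made up and the formula is simplified; the
real implementation works over windowed metrics):

    // Sketch: delay a throttled client so its average usage matches its
    // quota, but cap the delay at the metrics window so one expensive
    // request cannot stall the client indefinitely.
    static long throttleDelayMs(double requestTimeMs, double quotaUnits,
                                long windowMs) {
        // e.g. 50ms of io thread time at a 0.001-unit quota would give an
        // uncapped delay of 50ms / 0.001 = 50,000ms (the 50 seconds above)
        long uncapped = (long) (requestTimeMs / quotaUnits);
        return Math.min(uncapped, windowMs); // capped at the window size
    }
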
On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:

@jun: i wasn't concerned about tying up a request processing thread, but
IIUC the code does still read the entire request out, which might add up
to a non-negligible amount of memory.

On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to window size
if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota
over a long period of time. Can you provide an upper bound on this
deviation?

2) What is the motivation for capping the maximum delay by the window
size? I am wondering if there is a better alternative to address the
problem.

3) It means that the existing metric-related config will have a more
direct impact on the mechanism of this io-thread-unit-based quota. This
may be an important change depending on the answer to 1) above. We
probably need to document this more explicitly.

Dong

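For what it's worth, a back-of-envelope bound on 1), assuming the
cap-at-window behaviour and the 50ms example from earlier in the thread
(an illustration, not from the KIP):

    // If the delay is capped at the window size W, a client can push at
    // most one maximal request through per window, so its sustained usage
    // is bounded by roughly maxRequestTimeMs / W regardless of how small
    // its quota is. E.g. 50ms per 10,000ms window = 0.5% of one io thread.
    static double worstCaseUsage(double maxRequestTimeMs, double windowMs) {
        return maxRequestTimeMs / windowMs;
    }
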
On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't because at LinkedIn it will be too
much pressure on inGraph to expose those per-clientId metrics, so we ended
up printing them periodically to a local log. Never mind if it is not a
general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for
every quota in ProduceResponse or FetchResponse. Is there any use-case for
having separate throttle-time fields for byte-rate-quota and
io-thread-unit-quota? You probably need to document this as an interface
change if you plan to add a new field in any request.

- I don't think IOThread belongs to quotaType. The existing quota types
(i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the
type of request that is throttled, not the quota mechanism that is
applied.

- If a request is throttled due to this io-thread-unit-based quota, is the
existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the
io-thread-unit-based quota, and for users to understand its impact on
their traffic, would it be useful to have a metric that shows the overall
byte-rate per io-thread-unit? Can we also show this as a per-clientId
metric?

Thanks,
Dong

On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU
cores, but it's possible for an admin to start with fewer io threads than
cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a
user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of
returning an error immediately is that the latter has no way to prevent
the client from retrying immediately, which will make things worse. The
delaying logic is based off a delay queue. A separate expiration thread
just waits on the next request to be expired. So, it doesn't tie up a
request handler thread.

Thanks,

Jun

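A minimal sketch of the delay-queue pattern described here (class and
method names are made up; this is not the actual broker code):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // Sketch: handlers enqueue a throttled response with its delay and
    // move on; a single expiration thread blocks until the next delay
    // expires and then sends the response, so no request handler thread
    // is tied up while a client is being throttled.
    class ThrottledResponse implements Delayed {
        final Runnable send;
        final long expireAtNanos;

        ThrottledResponse(Runnable send, long delayMs) {
            this.send = send;
            this.expireAtNanos =
                System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(expireAtNanos - System.nanoTime(),
                                TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    class ThrottleExpirationThread extends Thread {
        final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void throttle(Runnable send, long delayMs) {
            // called from a request handler thread; returns immediately
            queue.add(new ThrottledResponse(send, delayMs));
        }

        @Override
        public void run() {
            try {
                while (!isInterrupted())
                    queue.take().send.run(); // blocks until next expiry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }
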
On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle
time field in the response. The downside is that the client metrics will
be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and
`log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should
   have the total time your request was throttled, irrespective of the
   quotas that caused it? Limiting it to the byte rate quota doesn't make
   sense, but I also don't think we want to end up adding new fields in
   the response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about io
   threads. Once we introduce these quotas people set them and expect them
   to be enforced (and if they aren't it may cause an outage). As a result
   they are a bit more sensitive than normal configs, I think. The current
   thread pools seem like something of an implementation detail and not
   the level the user-facing quotas should be involved with. I think it
   might be better to make this a general request-time throttle with no
   mention in the naming about I/O threads and simply acknowledge the
   current limitation (which we may someday fix) in the docs that this
   covers only the time after the request is read off the network.
   3. As such I think the right interface to the user would be something
   like percent_request_time and be in {0,...,100} or request_time_ratio
   and be in {0.0,...,1.0} (I think "ratio" is the terminology we used if
   the scale is between 0 and 1 in the other metrics, right?)

-Jay

> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > On Thu, Feb 23,
> 2017
> > > at
> > > > > 3:45
> > > > > > > AM,
> > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > Sivaram
> > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > > > > > >> > >> > >>
> rajinisivaram@gmail.com
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > wrote:
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > Thank you for
> the
> > > > > > feedback.
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > Guozhang : I
> have
> > > > > updated
> > > > > > > the
> > > > > > > > > > > section
> > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > co-existence
> > > > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > > >> byte
> > > > > > > > > > > > > > > > > > > > > >> > >> rate
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > request time
> > quotas.
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > Dong: I hadn't
> > added
> > > > > much
> > > > > > > > detail
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > metrics
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > >> sensors
> > > > > > > > > > > > > > > > > > > > > >> > >> since
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> they
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > are
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > going to be very
> > > > similar
> > > > > > to
> > > > > > > > the
> > > > > > > > > > > > existing
> > > > > > > > > > > > > > > > metrics
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > > >> sensors.
> > > > > > > > > > > > > > > > > > > > > >> > >> To
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> avoid
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > confusion, I
> have
> > > now
> > > > > > added
> > > > > > > > more
> > > > > > > > > > > > detail.
> > > > > > > > > > > > > > All
> > > > > > > > > > > > > > > > > > metrics
> > > > > > > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > > >> in
> > > > > > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> group
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > "quotaType" and
> > all
> > > > > > sensors
> > > > > > > > have
> > > > > > > > > > > names
> > > > > > > > > > > > > > > > starting
> > > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > > > >> > >> "quotaType"
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> (where
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > quotaType is
> > > > > > Produce/Fetch/
> > > > > > > > > > > > > > > LeaderReplication/
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > FollowerReplication/*IOThread*
> > > > > > > > > ).
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > So there will be
> > no
> > > > > reuse
> > > > > > of
> > > > > > > > > > > existing
> > > > > > > > > > > > > > > > > > > metrics/sensors.
> > > > > > > > > > > > > > > > > > > > > The
> > > > > > > > > > > > > > > > > > > > > >> > new
> > > > > > > > > > > > > > > > > > > > > >> > >> > ones
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> for
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > request
> processing
> > > > time
> > > > > > > based
> > > > > > > > > > > > throttling
> > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > >> completely
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> independent
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > existing
> > > > > metrics/sensors,
> > > > > > > but
> > > > > > > > > will
> > > > > > > > > > > be
> > > > > > > > > > > > > > > > consistent
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > >> format.

The existing throttle_time_ms field in produce/fetch responses will not be
impacted by this KIP. That will continue to return byte-rate based throttling
times. In addition, a new field request_throttle_time_ms will be added to
return request quota based throttling times. These will be exposed as new
metrics on the client-side.
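
For illustration, the two values side by side (a plain case class, not the
KIP's actual wire-protocol schema):

    // Illustrative only: the two throttle-time values described above.
    case class ThrottleTimes(
      throttleTimeMs: Int,        // existing field: byte-rate based throttling
      requestThrottleTimeMs: Int  // new field: request-time quota throttling
    )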

Since all metrics and sensors are different for each type of quota, I believe
there are already sufficient metrics to monitor throttling on both the client
and broker side for each type of throttling.

Regards,

Rajini

On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to quota
a user's traffic here. LGTM overall. I have some questions regarding sensors.

- Can you be more specific in the KIP about what sensors will be added? For
example, it will be useful to specify the name and attributes of these new
sensors.

- We currently have throttle-time and queue-size for the byte-rate based
quota. Are you going to have separate throttle-time and queue-size for
requests throttled by the io_thread_unit-based quota, or will they share the
same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain
time due to the io_thread_unit-based quota?

- Currently the Kafka server doesn't provide any log or metrics that tell
whether any given clientId (or user) is throttled. This is not too bad because
we can still check the client-side byte-rate metric to validate whether a
given client is throttled. But with this io_thread_unit, there will be no way
to validate whether a given client is slow because it has exceeded its
io_thread_unit limit. It is necessary for the user to be able to know this
information to figure out whether they have reached their quota limit. How
about we add a log4j log on the server side to periodically print the
(client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that the
Kafka administrator can figure out which users have reached their limit and
act accordingly?
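A minimal sketch of this periodic-logging idea; throttledClients is an
assumed callback supplying (clientId, byteRateThrottleMs,
ioThreadUnitThrottleMs), not an existing Kafka API:

    import java.util.concurrent.{Executors, TimeUnit}
    import org.slf4j.LoggerFactory

    // Hypothetical helper: every 30 seconds, log one line per throttled client.
    object ThrottleSummaryLogger {
      private val log = LoggerFactory.getLogger("kafka.server.ThrottleSummary")
      private val scheduler = Executors.newSingleThreadScheduledExecutor()

      def start(throttledClients: () => Seq[(String, Double, Double)]): Unit =
        scheduler.scheduleAtFixedRate(new Runnable {
          override def run(): Unit =
            throttledClients().foreach { case (clientId, byteMs, ioMs) =>
              log.info(s"clientId=$clientId byteRateThrottleMs=$byteMs " +
                s"ioThreadUnitThrottleMs=$ioMs")
            }
        }, 30, 30, TimeUnit.SECONDS)
    }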

Thanks,

Dong

On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the
throttling implementation:

Stated as "Request processing time throttling will be applied on top if
necessary.", I thought that it meant the request processing time throttling is
applied first, but on continuing to read I found it actually meant to apply
produce / fetch byte rate throttling first.

Also the last sentence "The remaining delay if any is applied to the
response." is a bit confusing to me. Maybe rewording it a bit?

Guozhang

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The
property is called *io_thread_units* to align with the thread count property
*num.io.threads*. When we implement network thread utilization quotas, we can
add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun, did
you mean a different request that needs to be added? The four requests
currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr
and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is
easy to exclude them and only throttle if unauthorized. I wasn't sure if there
are other requests used only for inter-broker communication that needed to be
excluded.

3. I was thinking the smallest change would be to replace all references to
*requestChannel.sendResponse()* with a local method
*sendResponseMaybeThrottle()* that does the throttling, if any, and then sends
the response. If we throttle first in *KafkaApis.handle()*, the time spent
within the method handling the request will not be recorded or used in
throttling. We can look into this again when the PR is ready for review.
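A self-contained sketch of this wrapper idea; the traits below stand in for
broker internals, and recordAndGetThrottleTimeMs/throttle are assumed helpers
rather than Kafka's actual API:

    // Minimal sketch of the sendResponseMaybeThrottle() idea.
    trait Request
    trait Response
    trait RequestChannel { def sendResponse(response: Response): Unit }
    trait QuotaManager {
      def recordAndGetThrottleTimeMs(request: Request): Long   // assumed helper
      def throttle(request: Request, delayMs: Long, send: () => Unit): Unit
    }

    class ApiHandler(channel: RequestChannel, quotas: QuotaManager) {
      // Used in place of direct channel.sendResponse(...) calls, so the time
      // spent handling the request is recorded before any delay is applied.
      def sendResponseMaybeThrottle(request: Request, response: Response): Unit = {
        val delayMs = quotas.recordAndGetThrottleTimeMs(request)
        if (delayMs > 0)
          quotas.throttle(request, delayMs, () => channel.sendResponse(response))
        else
          channel.sendResponse(response)
      }
    }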

Regards,

Rajini

On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request
handler unit, then it's as if I have a Kafka broker with a single request
handler thread dedicated to me. That's the most I can use, at least. That
allocation doesn't change even if an admin later increases the size of the
request thread pool on the broker. It's similar to the CPU abstraction that
VMs and containers get from hypervisors or OS schedulers.

While different client access patterns can use wildly different amounts of
request thread resources per request, a given application will generally have
a stable access pattern and can figure out empirically how many "request
thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota based
on an absolute request thread unit is better.
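In numbers, the concern looks like this (an illustrative calculation, not
from the original thread):

    // Worked example: a percentage quota scales with the size of the pool.
    val quotaPercent   = 10.0
    val threadsBefore  = 8
    val threadsAfter   = 16  // admin doubles the request handler threads
    val capacityBefore = quotaPercent / 100 * threadsBefore // 0.8 thread-equivalents
    val capacityAfter  = quotaPercent / 100 * threadsAfter  // 1.6: absolute capacity doubled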

2. ControlledShutdownRequest is also an inter-broker request and needs to be
excluded from throttling.

3. Implementation-wise, I am wondering if it's simpler to apply the request
time throttling first in KafkaApis.handle(). Otherwise, we will need to add
the throttling logic in each type of request.

Thanks,

Jun

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler
utilization. At the moment, it uses a percentage, but I am happy to change to
a fraction (out of 1 instead of 100) if required. I have added the examples
from this discussion to the KIP. Also added a "Future Work" section to address
network thread utilization. The configuration is named "request_time_percent"
with the expectation that it can also be used as the limit for network thread
utilization when that is implemented, so that users have to set only one
config for the two and not have to worry about the internal distribution of
the work between the two thread pools in Kafka.
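One plausible reading of such a percentage, shown as an illustrative
calculation only (the KIP defines the actual semantics):

    // Assumed interpretation: fraction of total request-handler capacity
    // consumed by one entity within a one-second quota window.
    val numIoThreads       = 8
    val windowMs           = 1000L
    val userRequestTimeMs  = 240.0                   // handler time spent on this user
    val capacityMs         = numIoThreads * windowMs // 8000 ms of handler time per window
    val requestTimePercent = 100.0 * userRequestTimeMs / capacityMs // 3.0%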

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > Hi,
> > > > > > Rajini,
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > Thanks
> > > > > for
> > > > > > > the
> > > > > > > > > > > > proposal.
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > The
> > > > > > benefit
> > > > > > > of
> > > > > > > > > > using
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > > > > > >> processing
> > > > > > > > > > > > > > > > > > > > > >> > >> time
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> over
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > request
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > rate
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > exactly
> > > > > > what
> > > > > > > > > > people
> > > > > > > > > > > > have
> > > > > > > > > > > > > > > > said. I
> > > > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > > > > just
> > > > > > > > > > > > > > > > > > > > > >> > >> expand
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> that
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > a
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > bit.
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> Consider
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > following
> > > > > > > > case.
> > > > > > > > > > The
> > > > > > > > > > > > > > producer
> > > > > > > > > > > > > > > > > > sends a
> > > > > > > > > > > > > > > > > > > > > >> produce
> > > > > > > > > > > > > > > > > > > > > >> > >> > request
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > with a
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > 10MB
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> > message
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > but
> > > > > > > compressed
> > > > > > > > > to
> > > > > > > > > > > > 100KB
> > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > gzip.
> > > > > > > > > > > > > > > > > > > The
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> decompression of
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > message
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > broker
> > > > > > could
> > > > > > > > > take
> > > > > > > > > > > > 10-15
> > > > > > > > > > > > > > > > seconds,
> > > > > > > > > > > > > > > > > > > > during
> > > > > > > > > > > > > > > > > > > > > >> which
> > > > > > > > > > > > > > > > > > > > > >> > >> > time,
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> a
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > request
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > handler
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > thread
> > > > > is
> > > > > > > > > > completely
> > > > > > > > > > > > > > > blocked.
> > > > > > > > > > > > > > > > In
> > > > > > > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > > > > >> case,
> > > > > > > > > > > > > > > > > > > > > >> > >> > neither
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > byte-in
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > quota
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > nor
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > the
> > > > > > request
> > > > > > > > rate
> > > > > > > > > > > quota
> > > > > > > > > > > > > may
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > effective
> > > > > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> protecting
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > broker.
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > Consider
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > another
> > > > > > > case.
> > > > > > > > A
> > > > > > > > > > > > consumer
> > > > > > > > > > > > > > > group
> > > > > > > > > > > > > > > > > > > starts
> > > > > > > > > > > > > > > > > > > > > >> with 10
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> instances
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > and
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > later
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > on
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > switches
> > > > > > to
> > > > > > > 20
> > > > > > > > > > > > > instances.
> > > > > > > > > > > > > > > The
> > > > > > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > > > > > > > >> > will
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> likely
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > double,
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > but
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > actually
> > > > > > > load
> > > > > > > > on
> > > > > > > > > > the
> > > > > > > > > > > > > > broker
> > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > > > > > double
> > > > > > > > > > > > > > > > > > > > > >> > >> since
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> each
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > fetch
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > request
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > only
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > contains
> > > > > > > half
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > > > > partitions.
> > > > > > > > > > > > > > > > > > > > Request
> > > > > > > > > > > > > > > > > > > > > >> rate
> > > > > > > > > > > > > > > > > > > > > >> > >> > quota
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> may
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > not
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > be
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > easy
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > configure
> > > > > > in
> > > > > > > > > this
> > > > > > > > > > > > case.
What we really want is to be able to prevent a client from using too much of the server-side resources. In this particular KIP, this resource is the capacity of the request handler threads. I agree that it may not be intuitive for the users to determine how to set the right limit.

However, this is not completely new and has been done in the container world already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) has the concept of cpu.cfs_quota_us, which specifies the total amount of time in microseconds for which all tasks in a cgroup can run during a one-second period. We can potentially model the request handler threads in a similar way. For example, each request handler thread can be 1 request handler unit and the admin can configure a limit on how many units (say 0.01) a client can have.
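To make the unit idea concrete, the accounting could look something like the following sketch (illustrative Java only, not actual Kafka code; 1.0 unit corresponds to one full request handler thread):

    // Rough sketch: cgroup-style accounting for request handler threads.
    // A quota of 0.01 units means 1% of one thread's time per window
    // (cf. cpu.cfs_quota_us over cpu.cfs_period_us).
    class HandlerTimeQuota {
        private final double units;        // e.g. 0.01 for this client
        private final long windowNanos;    // e.g. one second
        private long windowStart = System.nanoTime();
        private long usedNanos = 0;

        HandlerTimeQuota(double units, long windowNanos) {
            this.units = units;
            this.windowNanos = windowNanos;
        }

        // A request handler thread reports time spent on this client's request.
        synchronized void record(long nanos) {
            maybeRoll();
            usedNanos += nanos;
        }

        // Violated once the client has used more than units * window of thread time.
        synchronized boolean violated() {
            maybeRoll();
            return usedNanos > units * windowNanos;
        }

        private void maybeRoll() {
            long now = System.nanoTime();
            if (now - windowStart >= windowNanos) {
                windowStart = now;
                usedNanos = 0;
            }
        }
    }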
Regarding not throttling the internal broker-to-broker requests: we could do that. Alternatively, we could just let the admin configure a high limit for the kafka user (though it may not be easy to do that based on clientId).
Ideally we want to be able to protect the utilization of the network thread pool too. The difficulty is mostly what Rajini said: (1) the mechanism for throttling the requests is through Purgatory and we will have to think through how to integrate that into the network layer; (2) in the network layer, currently we know the user, but not the clientId of the request, so it's a bit tricky to throttle based on clientId there. Plus, the byteOut quota can already protect the network thread utilization for fetch requests. So, if we can't figure out this part right now, just focusing on the request handling threads for this KIP is still a useful feature.

Thanks,

Jun
On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
Thank you all for the feedback.
Jay: I have removed the exemption for consumer heartbeat etc. Agree that protecting the cluster is more important than protecting individual apps. Have retained the exemption for StopReplica/LeaderAndIsr etc.; these are throttled only if authorization fails (so they can't be used for DoS attacks in a secure cluster, but inter-broker requests can complete without delays).
I will wait another day to see if there is any objection to quotas based on request processing time (as opposed to request rate) and, if there are no objections, I will revert to the original proposal with some changes.
The original proposal was only including the time used by the request handler threads (that made calculation easy). I think the suggestion is to include the time spent in the network threads as well since that may be significant. As Jay pointed out, it is more complicated to calculate the total available CPU time and convert to a ratio when there are *m* I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give us what we want, but it can be very expensive on some platforms. As Becket and Guozhang have pointed out, we do have several time measurements already for generating metrics that we could use, though we might want to switch to nanoTime() instead of currentTimeMillis() since some of the values for small requests may be < 1ms. But rather than add up the time spent in the I/O thread and the network thread, wouldn't it be better to convert the time spent on each thread into a separate ratio? UserA has a request quota of 5%. Can we take that to mean that UserA can use 5% of the time on network threads and 5% of the time on I/O threads? If either is exceeded, the response is throttled - it would mean maintaining two sets of metrics for the two durations, but would result in more meaningful ratios. We could define two quota limits (UserA has 5% of request threads and 10% of network threads), but that seems unnecessary and harder to explain to users.
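As a sketch of what the separate-ratio check could look like (illustrative only, not actual Kafka code; the busy-time inputs would come from the per-window time measurements we already have):

    // Rough sketch: throttle if the user exceeded its share of EITHER pool.
    // The capacity of a pool over a window is threads * windowNanos.
    class DualRatioQuota {
        static boolean shouldThrottle(double quotaPercent,
                                      long ioBusyNanos, long ioCapacityNanos,
                                      long netBusyNanos, long netCapacityNanos) {
            double ioPercent = 100.0 * ioBusyNanos / ioCapacityNanos;
            double netPercent = 100.0 * netBusyNanos / netCapacityNanos;
            return ioPercent > quotaPercent || netPercent > quotaPercent;
        }
    }

With this, a user's single quota percentage is applied independently to each pool, which keeps the ratio meaningful for both.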
Back to why and how quotas are applied to network thread utilization:
a) In the case of fetch, the time spent in the network thread may be significant and I can see the need to include this. Are there other requests where the network thread utilization is significant? In the case of fetch, request handler thread utilization would throttle clients with a high request rate and low data volume, and the fetch byte rate quota will throttle clients with high data volume. Network thread utilization is perhaps proportional to the data volume. I am wondering if we even need to throttle based on network thread utilization or whether the data volume quota covers this case.
b) At the moment, we record and check for quota violation at the same time. If a quota is violated, the response is delayed. Using Jay's example of disk reads for fetches happening in the network thread, we can't record and delay a response after the disk reads. We could record the time spent on the network thread when the response is complete and introduce a delay for handling a subsequent request (separating out recording and quota-violation handling in the case of network thread overload).
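To illustrate, separating the two could look roughly like this (illustrative only, not actual Kafka code; Quota here is a hypothetical interface over the sensors):

    // Rough sketch: record on response completion, penalize the NEXT request.
    interface Quota {
        void record(long timeMs);     // record observed network thread time
        boolean violated();           // quota exceeded in the current window?
        long suggestedDelayMs();      // delay needed to fall back within quota
    }

    class DeferredThrottle {
        private long owedDelayMs = 0; // penalty accumulated from past overruns

        // Called when a response has been fully sent on the network thread.
        synchronized void onResponseComplete(long networkThreadTimeMs, Quota quota) {
            quota.record(networkThreadTimeMs);
            if (quota.violated()) {
                owedDelayMs += quota.suggestedDelayMs();
            }
        }

        // Called before reading the client's next request; returns how long
        // to delay it (e.g. by muting the channel) to pay back the overrun.
        synchronized long delayForNextRequest() {
            long delay = owedDelayMs;
            owedDelayMs = 0;
            return delay;
        }
    }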
Does that make sense?
Regards,

Rajini
On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:
Hey Jay,
Yeah, I agree that enforcing the CPU time is a little tricky. I am thinking that maybe we can use the existing request statistics. They are already very detailed, so we can probably see the approximate CPU time from them, e.g. something like (total_time - request/response_queue_time - remote_time).
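In code, that derivation would be something along these lines (illustrative only; the names mirror the totalTime/queueTime/remoteTime breakdown in the existing request metrics):

    // Rough sketch: time not spent waiting in queues or on remote waits
    // (e.g. on followers or in purgatory) approximates local processing time.
    class RequestStats {
        static long approxProcessingTimeMs(long totalTimeMs,
                                           long requestQueueTimeMs,
                                           long responseQueueTimeMs,
                                           long remoteTimeMs) {
            return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs - remoteTimeMs;
        }
    }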
I agree with Guozhang that when a user is throttled it is likely that we need to see if anything has gone wrong first, and if the users are well behaved and just need more resources, we will have to bump up the quota for them. It is true that pre-allocating CPU time quota precisely for the users is difficult. So in practice it would probably be more like first setting a relatively high protective CPU time quota for everyone and increasing it for some individual clients on demand.
Thanks,

Jiangjie (Becket) Qin
On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
This is a great proposal, glad to see it happening.
I am inclined to the CPU throttling, or more specifically the processing time ratio, instead of the request rate throttling as well. Becket has summed up my rationales very well above, and one thing to add here is that the former has good support for both "protecting against rogue clients" as well as "utilizing a cluster for multi-tenancy usage": when thinking about how to explain this to the end users, I find it actually more natural than the request rate since, as mentioned above, different requests will have
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > > > quite
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > different
> > > > > > > > > > > > "cost",
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > Kafka
> > > > > > > > > > > > > > > > > > > > today
> > > > > > > > > > > > > > > > > > > > > >> > already
> > > > > > > > > > > > > > > > > > > > > >> > >> > have
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > various
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > request
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > types
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > (produce,
> > > > > > > > > > > fetch,
> > > > > > > > > > > > > > > admin,
> > > > > > > > > > > > > > > > > > > > metadata,
> > > > > > > > > > > > > > > > > > > > > >> etc),
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> because
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > of
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> > request
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > > rate
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > throttling
> > > > > > > > > > may
> > > > > > > > > > > > not
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > > effective
> > > > > > > > > > > > > > > > > > > > > >> > >> unless it
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> is
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > set
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > very
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > > > > > > > > conservatively.
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > Regarding
> > > > > > > > > to
> > > > > > > > > > > > user
> > > > > > > > > > > > > > > > > reactions
> > > > > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > > > > >> they
> > > > > > > > > > > > > > > > > > > > > >> > are
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > throttled,
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > I
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > think
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > it
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > > > differ
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > > > case-by-case,
> > > > > > > > > > > > and
> > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > > >> > discovered /
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> guided
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > by
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > looking
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > at
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > relative
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > metrics.
> > > > > > > > > So
> > > > > > > > > > in
> > > > > > > > > > > > > other
> > > > > > > > > > > > > > > > words
> > > > > > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > > > > > > > >> > >> not
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> expect
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > get
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> > > additional
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > > information
> > > > > > > > > > by
> > > > > > > > > > > > > > simply
> > > > > > > > > > > > > > > > > being
> > > > > > > > > > > > > > > > > > > told
> > > > > > > > > > > > > > > > > > > > > >> "hey,
> > > > > > > > > > > > > > > > > > > > > >> > >> you
> > > > > > > > > > > > > > > > > > > > > >> > >> > are
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > throttled",
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > which
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > all
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > what
> > > > > > > > > > > throttling
> > > > > > > > > > > > > > does;
> > > > > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > > >> > take a
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > follow-up
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > step
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > and
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > see
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > "hmm,
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > > I'm
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > > throttled
> > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > because
> > > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > ..",
> > > > > > > > > > > > > > > > > > > > > >> > which
> > > > > > > > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > > > > > > > >> > >> > by
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > looking
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > metric
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> >
> > >
> > > >
> > > > > > > values:
> > > > > > > > > e.g.
> > > > > > > > > > > > > whether
> > > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > > > > > bombarding
> > > > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> brokers
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > > [Message clipped]
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > --
> > > > > > > > > > > > *Todd Palino*
> > > > > > > > > > > > Staff Site Reliability Engineer
> > > > > > > > > > > > Data Infrastructure Streaming
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > linkedin.com/in/toddpalino
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > *Todd Palino*
> > > > > > > > > > Staff Site Reliability Engineer
> > > > > > > > > > Data Infrastructure Streaming
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > linkedin.com/in/toddpalino
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Thank you, Jun.

Many thanks to everyone for the feedback and suggestions so far. If there
are any other suggestions or concerns, please do raise them on this thread.
Otherwise, I will start the vote early next week.

Regards,

Rajini


On Thu, Mar 16, 2017 at 11:48 AM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> Thanks for the updated KIP. It looks good to me now. Perhaps we can wait
> for a couple of more days to see if there are more comments and then start
> the vote?
>
> Jun
>
> On Thu, Mar 16, 2017 at 6:35 AM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Jun,
> >
> > 50. Yes, that makes sense. I have updated the KIP.
> >
> > Thank you,
> >
> > Rajini
> >
> > On Mon, Mar 13, 2017 at 7:35 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Thanks for the updated KIP. Looks good. Just one more thing.
> > >
> > > 50. "Two new metrics request-throttle-time-max and
> > > request-throttle-time-min
> > >  will be added to reflect total request processing time based
> throttling
> > > for all request types including produce/fetch." The most important
> > clients
> > > are producer and consumer, which already have the
> > > produce/fetch-throttle-time-min/max
> > > metrics. Should we just accumulate the throttled time for other
> requests
> > > into these two existing metrics, instead of introducing new ones? We
> can
> > > probably add a similar metric for the admin client later on.
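> > >
> > > (Purely as an illustration of that accumulation idea, not code from the
> > > KIP - the sensor and metric names below mirror the existing client
> > > metrics, while response.throttleTimeMs() is a hypothetical accessor:)
> > >
> > > import org.apache.kafka.common.metrics.Metrics;
> > > import org.apache.kafka.common.metrics.Sensor;
> > > import org.apache.kafka.common.metrics.stats.Max;
> > > import org.apache.kafka.common.metrics.stats.Min;
> > >
> > > Metrics metrics = new Metrics();
> > > Sensor throttleTime = metrics.sensor("produce-throttle-time");
> > > throttleTime.add(metrics.metricName("produce-throttle-time-max",
> > >     "producer-metrics"), new Max());
> > > throttleTime.add(metrics.metricName("produce-throttle-time-min",
> > >     "producer-metrics"), new Min());
> > > // One sensor accumulates the throttle time from every response,
> > > // whichever quota (byte rate or request time) caused the delay.
> > > throttleTime.record(response.throttleTimeMs());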
> > >
> > > Jun
> > >
> > >
> > > On Thu, Mar 9, 2017 at 2:24 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > wrote:
> > >
> > > > Jun,
> > > >
> > > > 40. Yes, you are right, a single value tracking the total exempt time
> > > > is sufficient. Have updated the KIP.
> > > >
> > > > Thank you,
> > > >
> > > > Rajini
> > > >
> > > > On Thu, Mar 9, 2017 at 9:42 PM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Rajini,
> > > > >
> > > > > The updated KIP looks good. Just one more comment.
> > > > >
> > > > > 40. "An additional metric exempt-request-time will also be added
> for
> > > each
> > > > > quota entity for the quota type Request." Should that metric be
> added
> > > for
> > > > > each entity type (e.g., user, client-id, etc)? It seems that value
> is
> > > > > independent of entity types.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Thu, Mar 9, 2017 at 12:07 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Hi Jun,
> > > > > >
> > > > > > Thank you for reviewing the KIP again.
> > > > > >
> > > > > > 30. That is a good idea. In fact, it is one of the advantages of
> > > > > > measuring overall utilization rather than separate values for
> > > > > > network and I/O threads as I had intended initially. Have updated
> > > > > > the KIP, thanks.
> > > > > >
> > > > > > 31. Added exempt-request-time metric.
> > > > > >
> > > > > > 32. I had thought of using quota.window.size.seconds *
> > > > > > quota.window.num initially, but felt that would be too big. Even
> > > > > > the default of 11 seconds is a rather long time to be throttled.
> > > > > > With a cap of quota.window.size.seconds, if the recorded time was
> > > > > > very high, each subsequent request within the total sample interval
> > > > > > will still be throttled, for up to quota.window.size.seconds each.
> > > > > > So the cap bounds the throttle time of an individual request,
> > > > > > avoiding timeouts where possible, while still throttling over a
> > > > > > period of time.
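> > > > > >
> > > > > > (A minimal sketch of that cap - the variable names are made up,
> > > > > > not from the KIP:)
> > > > > >
> > > > > > // Bound the delay applied to any single request by one quota window.
> > > > > > long windowMs = quotaWindowSizeSeconds * 1000L; // one second by default
> > > > > > long throttleTimeMs = Math.min(computedDelayMs, windowMs);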
> > > > > >
> > > > > > 33. Updated to use request_percentage.
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi, Rajini,
> > > > > > >
> > > > > > > Thanks for the updated KIP. A few more comments.
> > > > > > >
> > > > > > > 30. Should we just account for the time in network threads in
> > > > > > > this KIP too? The issue with doing this later is that existing
> > > > > > > quotas may be too small and everyone will have to adjust them
> > > > > > > before upgrading, which is inconvenient. If we just do the
> > > > > > > delaying in the io threads, there probably isn't too much
> > > > > > > additional work to include the network thread time?
> > > > > > >
> > > > > > > 31. It would be useful for the new metrics to capture the
> > > > > > > utilization of all those requests exempt from request throttling
> > > > > > > (under sth like "exempt"). It's useful for an admin to know how
> > > > > > > much time is spent there too.
> > > > > > >
> > > > > > > 32. "The maximum throttle time for any single request will be the
> > > > > > > quota window size (one second by default)." We probably should
> > > > > > > cap the delay at quota.window.size.seconds * quota.window.num?
> > > > > > >
> > > > > > > 33. It's unfortunate that we use . in configs and _ in ZK data
> > > > > > > structures. However, for consistency, request.percentage in ZK
> > > > > > > probably should be request_percentage?
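> > > > > > >
> > > > > > > (For example, following the existing quota config layout, the
> > > > > > > znode at /config/users/<user> might then look roughly like this -
> > > > > > > the values are illustrative:)
> > > > > > >
> > > > > > > {
> > > > > > >   "version": 1,
> > > > > > >   "config": {
> > > > > > >     "producer_byte_rate": "1024000",
> > > > > > >     "request_percentage": "50"
> > > > > > >   }
> > > > > > > }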
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I have updated the KIP to use "request.percentage" quotas
> where
> > > the
> > > > > > > > percentage is out of a total of (num.io.threads * 100). I
> have
> > > > added
> > > > > > the
> > > > > > > > other options considered so far under "Rejected
> Alternatives".
> > > > > > > >
> > > > > > > > To address Todd's concern about per-thread quotas: Even
> though
> > > the
> > > > > > quotas
> > > > > > > > are out of (num.io.threads * 100)  clients are not locked
> into
> > > > > threads.
> > > > > > > > Utilization is measured as the total across all the I/O
> threads
> > > and
> > > > > 10
> > > > > > %
> > > > > > > > quota can be 1% of 10 threads. Individual quotas can also be
> > > > greater
> > > > > > than
> > > > > > > > 100% if required.
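> > > > > > > >
> > > > > > > > (As an illustration, assuming kafka-configs.sh is extended to
> > > > > > > > accept the new quota name, granting userA 200% - two I/O
> > > > > > > > threads' worth - might look like:)
> > > > > > > >
> > > > > > > > bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
> > > > > > > >     --add-config 'request_percentage=200' \
> > > > > > > >     --entity-type users --entity-name userA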
> > > > > > > >
> > > > > > > > Please let me know if there are any other concerns or suggestions.
> > > > > > > >
> > > > > > > > Thank you,
> > > > > > > >
> > > > > > > > Rajini
> > > > > > > >
> > > > > > > > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tpalino@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Rajini -
> > > > > > > > >
> > > > > > > > > I understand what you’re saying, but the point I’m making is
> > > > > > > > > that I don’t believe we need to take it into account directly.
> > > > > > > > > The CPU utilization of the network threads is directly
> > > > > > > > > proportional to the number of bytes being sent. The more
> > > > > > > > > bytes, the more CPU that is required for SSL (or other tasks).
> > > > > > > > > This is opposed to the request handler threads, where there
> > > > > > > > > are a number of factors that affect CPU utilization. This
> > > > > > > > > means that it’s not necessary to separately quota network
> > > > > > > > > thread byte usage and CPU - if we quota byte usage (which we
> > > > > > > > > already do), we have fixed the CPU usage at a proportional
> > > > > > > > > amount.
> > > > > > > > >
> > > > > > > > > Jun -
> > > > > > > > >
> > > > > > > > > Thanks for the clarification there. I was thinking of the
> > > > > > > > > utilization percentage as being fixed, not what the percentage
> > > > > > > > > reflects. I’m not tied to either way of doing it, provided
> > > > > > > > > that we do not lock clients to a single thread. For example,
> > > > > > > > > if I specify that a given client can use 10% of a single
> > > > > > > > > thread, that should also mean they can use 1% on 10 threads.
> > > > > > > > >
> > > > > > > > > -Todd
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Todd,
> > > > > > > > > >
> > > > > > > > > > Thanks for the feedback.
> > > > > > > > > >
> > > > > > > > > > I just want to clarify your second point. If the limit
> > > > > > > > > > percentage is per thread and the thread counts are changed,
> > > > > > > > > > the absolute processing limit for existing users hasn't
> > > > > > > > > > changed and there is no need to adjust them. On the other
> > > > > > > > > > hand, if the limit percentage is of total thread pool
> > > > > > > > > > capacity and the thread counts are changed, the effective
> > > > > > > > > > processing limit for a user will change. So, to preserve the
> > > > > > > > > > current processing limit, existing user limits have to be
> > > > > > > > > > adjusted. If there is a hardware change, the effective
> > > > > > > > > > processing limit for a user will change in either approach
> > > > > > > > > > and the existing limit may need to be adjusted. However,
> > > > > > > > > > hardware changes are less common than thread pool
> > > > > > > > > > configuration changes.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tpalino@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > I’ve been following this one on and off, and overall it
> > > > > > > > > > > sounds good to me.
> > > > > > > > > > >
> > > > > > > > > > > - The SSL question is a good one. However, that type of
> > > > > > > > > > > overhead should be proportional to the bytes rate, so I
> > > > > > > > > > > think that a bytes rate quota would still be a suitable
> > > > > > > > > > > way to address it.
> > > > > > > > > > >
> > > > > > > > > > > - I think it’s better to make the quota percentage of
> > > > > > > > > > > total thread pool capacity, and not percentage of an
> > > > > > > > > > > individual thread. That way you don’t have to adjust it
> > > > > > > > > > > when you adjust thread counts (tuning, hardware changes,
> > > > > > > > > > > etc.)
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > -Todd
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <becket.qin@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I see. Good point about SSL.
> > > > > > > > > > > >
> > > > > > > > > > > > I just asked Todd to take a look.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Jiangjie,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yes, I agree that byte rate already protects the
> > > > > > > > > > > > > network threads indirectly. I am not sure if byte rate
> > > > > > > > > > > > > fully captures the CPU overhead in network due to SSL.
> > > > > > > > > > > > > So, at the high level, we can use request time limit to
> > > > > > > > > > > > > protect CPU and use byte rate to protect storage and
> > > > > > > > > > > > > network.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Also, do you think you can get Todd to comment on this
> > > > > > > > > > > > > KIP?
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <becket.qin@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi Rajini/Jun,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The percentage based reasoning sounds good.
> > > > > > > > > > > > > > One thing I am wondering is that if we assume the
> > > > > > > > > > > > > > network threads are just doing the network IO, can we
> > > > > > > > > > > > > > say the bytes rate quota is already sort of a network
> > > > > > > > > > > > > > threads quota?
> > > > > > > > > > > > > > If we take network threads into the consideration
> > > > > > > > > > > > > > here, would that be somewhat overlapping with the
> > > > > > > > > > > > > > bytes rate quota?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you for the explanation, I hadn't realized
> > > > > > > > > > > > > > > you meant percentage of the total thread pool. If
> > > > > > > > > > > > > > > everyone is OK with Jun's suggestion, I will update
> > > > > > > > > > > > > > > the KIP.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Let's take your example. Let's say a user sets the
> > > > > > > > > > > > > > > > limit to 50%. I am not sure if it's better to apply
> > > > > > > > > > > > > > > > the same percentage separately to network and io
> > > > > > > > > > > > > > > > thread pool. For example, for produce requests,
> > > > > > > > > > > > > > > > most of the time will be spent in the io threads
> > > > > > > > > > > > > > > > whereas for fetch requests, most of the time will
> > > > > > > > > > > > > > > > be in the network threads. So, using the same
> > > > > > > > > > > > > > > > percentage in both thread pools means one of the
> > > > > > > > > > > > > > > > pools' resource will be over allocated.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > An alternative way is to simply model network and
> > > > > > > > > > > > > > > > io thread pool together. If you get 10 io threads
> > > > > > > > > > > > > > > > and 5 network threads, you get 1500% request
> > > > > > > > > > > > > > > > processing power. A 50% limit means a total of 750%
> > > > > > > > > > > > > > > > processing power. We just add up the time a user
> > > > > > > > > > > > > > > > request spent in either network or io thread. If
> > > > > > > > > > > > > > > > that total exceeds 750% (doesn't matter whether
> > > > > > > > > > > > > > > > it's spent more in network or io thread), the
> > > > > > > > > > > > > > > > request will be throttled. This seems more general
> > > > > > > > > > > > > > > > and is not sensitive to the current implementation
> > > > > > > > > > > > > > > > detail of having a separate network and io thread
> > > > > > > > > > > > > > > > pool. In the future, if the threading model
> > > > > > > > > > > > > > > > changes, the same concept of quota can still be
> > > > > > > > > > > > > > > > applied. For now, since it's a bit tricky to add
> > > > > > > > > > > > > > > > the delay logic in the network thread pool, we
> > > > > > > > > > > > > > > > could probably just do the delaying only in the io
> > > > > > > > > > > > > > > > threads as you suggested earlier.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > There is still the orthogonal question of whether
> > > > > > > > > > > > > > > > a quota of 50% is out of 100% or 100% * #total
> > > > > > > > > > > > > > > > processing threads. My feeling is that the latter
> > > > > > > > > > > > > > > > is slightly better based on my explanation earlier.
> > > > > > > > > > > > > > > > The way to describe this quota to the users can be
> > > > > > > > > > > > > > > > "share of elapsed request processing time on a
> > > > > > > > > > > > > > > > single CPU" (similar to top).
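> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > (A toy sketch of this accounting with the numbers
> > > > > > > > > > > > > > > > above; all names are hypothetical:)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > double capacityPercent = (10 + 5) * 100.0;   // 1500%
> > > > > > > > > > > > > > > > double quotaPercent = 0.5 * capacityPercent; // 750%
> > > > > > > > > > > > > > > > // Sum the time a user's requests spent in either
> > > > > > > > > > > > > > > > // pool during the quota window:
> > > > > > > > > > > > > > > > double usedPercent =
> > > > > > > > > > > > > > > >     100.0 * (ioTimeMs + networkTimeMs) / windowMs;
> > > > > > > > > > > > > > > > boolean throttled = usedPercent > quotaPercent;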
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > But still not sure about a single quota covering
> > > > > > > > > > > > > > > > > both network threads and I/O threads with
> > > > > > > > > > > > > > > > > per-thread quota. If there are 10 I/O threads and
> > > > > > > > > > > > > > > > > 5 network threads and I want to assign half the
> > > > > > > > > > > > > > > > > quota to userA, the quota would be 750%. I
> > > > > > > > > > > > > > > > > imagine, internally, we would convert this to
> > > > > > > > > > > > > > > > > 500% for I/O and 250% for network threads to
> > > > > > > > > > > > > > > > > allocate 50% of each pool.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > A couple of scenarios:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1. Admin adds 1 extra network thread. To retain
> > > > > > > > > > > > > > > > > 50%, admin needs to now allocate 800% for each
> > > > > > > > > > > > > > > > > user. Or increase the quota for a few users. To
> > > > > > > > > > > > > > > > > me, it feels like admin needs to convert 50% to
> > > > > > > > > > > > > > > > > 800% and Kafka internally needs to convert 800%
> > > > > > > > > > > > > > > > > to (500%, 300%). Everyone using just 50% feels a
> > > > > > > > > > > > > > > > > lot simpler.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2. We decide to add some other thread to this
> > > > > > > > > > > > > > > > > list. Admin needs to know exactly how many
> > > > > > > > > > > > > > > > > threads form the maximum quota. And we can be
> > > > > > > > > > > > > > > > > changing this between broker versions as we add
> > > > > > > > > > > > > > > > > more to the list. Again a single overall percent
> > > > > > > > > > > > > > > > > would be a lot simpler.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > There were others who were unconvinced by a
> > > > > > > > > > > > > > > > > single percent from the initial proposal and were
> > > > > > > > > > > > > > > > > happier with thread units similar to CPU units,
> > > > > > > > > > > > > > > > > so I am ok with going with per-thread quotas (as
> > > > > > > > > > > > > > > > > units or percent). Just not sure it makes it
> > > > > > > > > > > > > > > > > easier for admin in all cases.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Consider modeling as n * 100% unit. For 2), the
> > > > > > > > > > > > > > > > > > question is what's causing the I/O threads to
> > > > > > > > > > > > > > > > > > be saturated. It's unlikely that all users'
> > > > > > > > > > > > > > > > > > utilization has increased at the same time. A
> > > > > > > > > > > > > > > > > > more likely case is that a few isolated users'
> > > > > > > > > > > > > > > > > > utilization has increased. If so, after
> > > > > > > > > > > > > > > > > > increasing the number of threads, the admin
> > > > > > > > > > > > > > > > > > just needs to adjust the quota for a few
> > > > > > > > > > > > > > > > > > isolated users, which is expected and is less
> > > > > > > > > > > > > > > > > > work.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1),
> > > > > > > > > > > > > > > > > > all users' quotas need to be adjusted, which is
> > > > > > > > > > > > > > > > > > unexpected and is more work.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > So, to me, the n * 100% model seems more
> > > > > > > > > > > > > > > > > > convenient.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > As for future extension to cover network thread
> > > > > > > > > > > > > > > > > > utilization, I was thinking that one way is to
> > > > > > > > > > > > > > > > > > simply model the capacity as (n + m) * 100%
> > > > > > > > > > > > > > > > > > unit, where n and m are the number of network
> > > > > > > > > > > > > > > > > > and i/o threads, respectively. Then, for each
> > > > > > > > > > > > > > > > > > user, we can just add up the utilization in the
> > > > > > > > > > > > > > > > > > network and the i/o thread. If we do this, we
> > > > > > > > > > > > > > > > > > don't need a new type of quota.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > If we use request.percentage as the
> > > > > > > > > > > > > > > > > > > percentage used in a single I/O thread, the
> > > > > > > > > > > > > > > > > > > total percentage being allocated will be
> > > > > > > > > > > > > > > > > > > num.io.threads * 100 for I/O threads and
> > > > > > > > > > > > > > > > > > > num.network.threads * 100 for network
> > > > > > > > > > > > > > > > > > > threads. A single quota covering the two as a
> > > > > > > > > > > > > > > > > > > percentage wouldn't quite work if you want to
> > > > > > > > > > > > > > > > > > > allocate the same proportion in both cases.
> > > > > > > > > > > > > > > > > > > If we want to treat threads as separate
> > > > > > > > > > > > > > > > > > > units, won't we need two quota configurations
> > > > > > > > > > > > > > > > > > > regardless of whether we use units or
> > > > > > > > > > > > > > > > > > > percentage? Perhaps I misunderstood your
> > > > > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >    1. The use case that you mentioned where
> > > > > > > > > > > > > > > > > > >    an admin is adding more users and decides
> > > > > > > > > > > > > > > > > > >    to add more I/O threads and expects to
> > > > > > > > > > > > > > > > > > >    find free quota to allocate for new users.
> > > > > > > > > > > > > > > > > > >    2. Admin adds more I/O threads because the
> > > > > > > > > > > > > > > > > > >    I/O threads are saturated and there are
> > > > > > > > > > > > > > > > > > >    cores available to allocate, even though
> > > > > > > > > > > > > > > > > > >    the number of users/clients hasn't changed.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > If we treated I/O threads as a single unit of
> > > > > > > > > > > > > > > > > > > 100%, all user quotas need to be reallocated
> > > > > > > > > > > > > > > > > > > for 1). If we allocated I/O threads as n
> > > > > > > > > > > > > > > > > > > units with n*100%, all user quotas need to be
> > > > > > > > > > > > > > > > > > > reallocated for 2), otherwise some of the new
> > > > > > > > > > > > > > > > > > > threads may just not be used. Either way it
> > > > > > > > > > > > > > > > > > > should be easy to write a script to
> > > > > > > > > > > > > > > > > > > decrease/increase quotas by a multiple for
> > > > > > > > > > > > > > > > > > > all users.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > So it really boils down to which quota unit
> > > > > > > > > > > > > > > > > > > is most intuitive in terms of configuration.
> > > > > > > > > > > > > > > > > > > And from the discussion so far, it feels like
> > > > > > > > > > > > > > > > > > > opinion is divided on whether quotas should
> > > > > > > > > > > > > > > > > > > be carved out of an absolute 100% (or 1 unit)
> > > > > > > > > > > > > > > > > > > or be relative to the number of threads
> > > > > > > > > > > > > > > > > > > (n*100% or n units).
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Another way to express an absolute limit is
> > > > > > > > > > > > > > > > > > > > to use request.percentage, but treat it as
> > > > > > > > > > > > > > > > > > > > the percentage used in a single request
> > > > > > > > > > > > > > > > > > > > handling thread. For now, the request
> > > > > > > > > > > > > > > > > > > > handling threads can be just the io
> > > > > > > > > > > > > > > > > > > > threads. In the future, they can cover the
> > > > > > > > > > > > > > > > > > > > network threads as well. This is similar to
> > > > > > > > > > > > > > > > > > > > how top reports CPU usage and may be a bit
> > > > > > > > > > > > > > > > > > > > easier for people to understand.
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Hi, Jay,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > 2. Regarding request.unit vs
> > > > > > > > > > > > > > > > > > > > > request.percentage. I started with
> > > > > > > > > > > > > > > > > > > > > request.percentage too. The reasoning for
> > > > > > > > > > > > > > > > > > > > > request.unit is the following. Suppose
> > > > > > > > > > > > > > > > > > > > > that the capacity has been reached on a
> > > > > > > > > > > > > > > > > > > > > broker and the admin needs to add a new
> > > > > > > > > > > > > > > > > > > > > user. A simple way to increase the
> > > > > > > > > > > > > > > > > > > > > capacity is to increase the number of io
> > > > > > > > > > > > > > > > > > > > > threads, assuming there are still enough
> > > > > > > > > > > > > > > > > > > > > cores. If the limit is based on
> > > > > > > > > > > > > > > > > > > > > percentage, the additional capacity
> > > > > > > > > > > > > > > > > > > > > automatically gets distributed to
> > > > > > > > > > > > > > > > > > > > > existing users and we haven't really
> > > > > > > > > > > > > > > > > > > > > carved out any additional resource for
> > > > > > > > > > > > > > > > > > > > > the new user. Now, is it easy for a user
> > > > > > > > > > > > > > > > > > > > > to reason about 0.1 unit vs 10%? My
> > > > > > > > > > > > > > > > > > > > > feeling is that both are hard and have to
> > > > > > > > > > > > > > > > > > > > > be configured empirically. Not sure if
> > > > > > > > > > > > > > > > > > > > > percentage is obviously easier to reason
> > > > > > > > > > > > > > > > > > > > > about.
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io>
> > > > > > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> A couple of quick points:
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> 1. Even though the implementation of this
> > > > > > > > > > > > > > > > > > > > >> quota is only using io thread time, i
> > > > > > > > > > > > > > > > > > > > >> think we should call it something like
> > > > > > > > > > > > > > > > > > > > >> "request-time". This will give us
> > > > > > > > > > > > > > > > > > > > >> flexibility to improve the implementation
> > > > > > > > > > > > > > > > > > > > >> to cover network threads in the future
> > > > > > > > > > > > > > > > > > > > >> and will avoid exposing internal details
> > > > > > > > > > > > > > > > > > > > >> like our thread pools on the server.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> 2. Jun/Roger, I get what you are trying
> > > > > > > > > > > > > > > > > > > > >> to fix but the idea of thread/units is
> > > > > > > > > > > > > > > > > > > > >> super unintuitive as a user-facing knob.
> > > > > > > > > > > > > > > > > > > > >> I had to read the KIP like eight times to
> > > > > > > > > > > > > > > > > > > > >> understand this. I'm not sure about your
> > > > > > > > > > > > > > > > > > > > >> point that increasing the number of
> > > > > > > > > > > > > > > > > > > > >> threads is a problem with a
> > > > > > > > > > > > > > > > > > > > >> percentage-based value; it really depends
> > > > > > > > > > > > > > > > > > > > >> on whether the user thinks about the
> > > > > > > > > > > > > > > > > > > > >> "percentage of request processing time"
> > > > > > > > > > > > > > > > > > > > >> or "thread units". If they think "I have
> > > > > > > > > > > > > > > > > > > > >> allocated 10% of my request processing
> > > > > > > > > > > > > > > > > > > > >> time to user x" then it is a bug that
> > > > > > > > > > > > > > > > > > > > >> increasing the thread count decreases
> > > > > > > > > > > > > > > > > > > > >> that percent as it does in the current
> > > > > > > > > > > > > > > > > > > > >> proposal. As a practical matter I think
> > > > > > > > > > > > > > > > > > > > >> the only way to actually reason about
> > > > > > > > > > > > > > > > > > > > >> this is as a percent---I just don't
> > > > > > > > > > > > > > > > > > > > >> believe people are going to think, "ah,
> > > > > > > > > > > > > > > > > > > > >> 4.3 thread units, that is the right
> > > > > > > > > > > > > > > > > > > > >> amount!". Instead I think they have to
> > > > > > > > > > > > > > > > > > > > >> understand this thread unit concept,
> > > > > > > > > > > > > > > > > > > > >> figure out what they have set in number
> > > > > > > > > > > > > > > > > > > > >> of threads, compute a percent and then
> > > > > > > > > > > > > > > > > > > > >> come up with the number of thread units,
> > > > > > > > > > > > > > > > > > > > >> and these will all be wrong if that
> > > > > > > > > > > > > > > > > > > > >> thread count changes. I also think this
> > > > > > > > > > > > > > > > > > > > >> ties us to throttling the I/O thread
> > > > > > > > > > > > > > > > > > > > >> pool, which may not be where we want to
> > > > > > > > > > > > > > > > > > > > >> end up.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> 3. For what it's worth I do think having
> > > > > > > > > > > > > > > > > > > > >> a single throttle_ms field in all the
> > > > > > > > > > > > > > > > > > > > >> responses that combines all throttling
> > > > > > > > > > > > > > > > > > > > >> from all quotas is probably the simplest.
> > > > > > > > > > > > > > > > > > > > >> There could be a use case for having
> > > > > > > > > > > > > > > > > > > > >> separate fields for each, but I think
> > > > > > > > > > > > > > > > > > > > >> that is actually harder to use/monitor
> > > > > > > > > > > > > > > > > > > > >> in the common case so unless someone has
> > > > > > > > > > > > > > > > > > > > >> a use case I think just one should be
> > > > > > > > > > > > > > > > > > > > >> fine.
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> -Jay
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > >> > I have updated the KIP based on the
> > > > > > > > > > > > > > > > > > > > >> > discussions so far.
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > Regards,
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > Rajini
> > > > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > > > > > > > > > >> >
Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like
LeaderAndIsr etc. The simplest way to ensure that clients cannot use
these requests to bypass quotas for DoS attacks is to ensure that ACLs
prevent clients from using these requests and unauthorized requests are
included towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a
separate throttle time, and all utilization based quotas could use the
same field (we won't add another one for network thread utilization for
instance). But perhaps it makes sense to keep byte rate quotas separate
in produce/fetch responses to provide separate metrics? Agree with
Ismael that the name of the existing field should be changed if we have
two. Happy to switch to a single combined throttle time if that is
sufficient.

Ismael #4, #5, #6: Will update KIP. Will use dot separated name for new
property. Replication quotas use dot separated, so it will be
consistent with all properties except byte rate quotas.
Radai: #1 Request processing time rather than request rate was chosen
because the time per request can vary significantly between requests,
as mentioned in the discussion and KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more
configuration and more metrics. Since most users would set quotas
higher than the expected usage and quotas are more of a safety net, a
single quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of
active connections, since only one request per connection will be
throttled at a time.
#4 As with byte rate quotas, to use the full allocated quotas,
clients/users would need to use partitions that are distributed across
the cluster. The alternative of using cluster-wide quotas instead of
per-broker quotas would be far too complex to implement.
Dong: We currently have two ClientQuotaManagers for the quota types
Fetch and Produce. A new one will be added for IOThread, which manages
quotas for I/O thread utilization. This will not update the Fetch or
Produce queue-size, but will have a separate metric for the queue-size.
I wasn't planning to add any additional metrics apart from the
equivalent ones for existing quotas as part of this KIP. Ratio of
byte-rate to I/O thread utilization could be slightly misleading since
it depends on the sequence of requests. But we can look into more
metrics after the KIP is implemented if required.
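As a rough sketch of how a third manager could sit beside the existing
two (the enum value, class shape and method below are illustrative
only, not the actual broker code):

    import java.util.EnumMap;
    import java.util.Map;

    public class QuotaManagersSketch {
        // Illustrative only: the broker's real QuotaType and
        // ClientQuotaManager look different.
        enum QuotaType { Fetch, Produce, IOThread }

        static class ClientQuotaManager {
            final QuotaType type;

            ClientQuotaManager(QuotaType type) {
                this.type = type;
            }

            // Records usage for the client and returns the delay (ms)
            // to apply, if any. Stubbed out for the sketch.
            long recordAndMaybeThrottle(String clientId, double value) {
                return 0L;
            }
        }

        public static void main(String[] args) {
            Map<QuotaType, ClientQuotaManager> managers =
                new EnumMap<>(QuotaType.class);
            for (QuotaType type : QuotaType.values())
                managers.put(type, new ClientQuotaManager(type));

            // The IOThread manager keeps its own sensors (throttle-time,
            // queue-size), so existing Fetch/Produce metrics are untouched.
            long delayMs = managers.get(QuotaType.IOThread)
                .recordAndMaybeThrottle("clientA", 5.0 /* ms of I/O thread time */);
            System.out.println("delayMs=" + delayMs);
        }
    }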
I think we need to limit the maximum delay since all requests are
throttled. If a client has a quota of 0.001 units and a single request
used 50ms, we don't want to delay all requests from the client by 50
seconds, throwing the client out of all its consumer groups. The issue
arises only if a user is allocated a quota that is insufficient to
process one large request. The expectation is that the units allocated
per user will be much higher than the time taken to process one
request, and the limit should seldom be applied. Agree this needs
proper documentation.
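To spell out the arithmetic in that example (a sketch with illustrative
names and window size, not the KIP's final configuration):

    public class ThrottleDelaySketch {
        public static void main(String[] args) {
            double quotaUnits = 0.001; // fraction of one I/O thread allowed
            long requestTimeMs = 50;   // processing time of one large request

            // Uncapped delay: how long the client must stay idle before
            // its measured utilization falls back within quota.
            long uncappedDelayMs = (long) (requestTimeMs / quotaUnits); // 50,000 ms

            // Capping the delay (e.g. at the quota window) keeps one
            // expensive request from stalling the client long enough to
            // drop out of its consumer groups.
            long quotaWindowMs = 30_000; // illustrative, not a KIP default
            long appliedDelayMs = Math.min(uncappedDelayMs, quotaWindowMs);
            System.out.println("uncapped=" + uncappedDelayMs
                + "ms, applied=" + appliedDelayMs + "ms");
        }
    }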
Regards,

Rajini
On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com>
wrote:

@jun: i wasnt concerned about tying up a request processing thread, but
IIUC the code does still read the entire request out, which might add
up to a non-negligible amount of memory.
On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to the
window size if it is larger than the window size. I have a concern with
this:

1) This essentially means that the user is allowed to exceed their
quota over a long period of time. Can you provide an upper bound on
this deviation?

2) What is the motivation for capping the maximum delay at the window
size? I am wondering if there is a better alternative to address the
problem.

3) It means that the existing metric-related config will have a more
direct impact on the mechanism of this io-thread-unit-based quota. This
may be an important change depending on the answer to 1) above. We
probably need to document this more explicitly.

Dong
On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't because at LinkedIn it would be
too much pressure on inGraph to expose those per-clientId metrics, so
we ended up printing them periodically to a local log. Never mind if it
is not a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for
every quota in ProduceResponse or FetchResponse. Is there any use-case
for having separate throttle-time fields for the byte-rate quota and
the io-thread-unit quota? You probably need to document this as an
interface change if you plan to add a new field in any request.

- I don't think IOThread belongs in quotaType. The existing quota types
(i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the
type of request that is throttled, not the quota mechanism that is
applied.

- If a request is throttled due to this io-thread-unit-based quota, is
the existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the
io-thread-unit-based quota and for users to understand its impact on
their traffic, would it be useful to have a metric that shows the
overall byte-rate per io-thread-unit? Can we also show this as a
per-clientId metric?

Thanks,
Dong
On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU
cores, but it's possible for an admin to start with fewer io threads
than cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a
user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead
of returning an error immediately is that the latter has no way to
prevent the client from retrying immediately, which will make things
worse. The delaying logic is based off a delay queue. A separate
expiration thread just waits on the next request to be expired. So, it
doesn't tie up a request handler thread.
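For illustration, a minimal sketch of that pattern using
java.util.concurrent.DelayQueue; this shows the shape of the mechanism,
not the broker's actual classes:

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    public class DelayQueueSketch {
        // One entry per throttled response, ordered by expiry time.
        static class ThrottledResponse implements Delayed {
            final long sendAtMs;
            final Runnable sendResponse;

            ThrottledResponse(long delayMs, Runnable sendResponse) {
                this.sendAtMs = System.currentTimeMillis() + delayMs;
                this.sendResponse = sendResponse;
            }

            @Override
            public long getDelay(TimeUnit unit) {
                return unit.convert(sendAtMs - System.currentTimeMillis(),
                                    TimeUnit.MILLISECONDS);
            }

            @Override
            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                                    other.getDelay(TimeUnit.MILLISECONDS));
            }
        }

        public static void main(String[] args) throws InterruptedException {
            DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

            // A single expiration thread blocks on take() until the next
            // response is due; request handler threads only enqueue, so
            // none of them are tied up waiting.
            Thread expirer = new Thread(() -> {
                while (!Thread.currentThread().isInterrupted()) {
                    try {
                        queue.take().sendResponse.run();
                    } catch (InterruptedException e) {
                        Thread.currentThread().interrupt();
                    }
                }
            });
            expirer.setDaemon(true);
            expirer.start();

            queue.add(new ThrottledResponse(100,
                () -> System.out.println("response sent after delay")));
            Thread.sleep(300); // keep main alive so the daemon can fire
        }
    }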
Thanks,

Jun
On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single
throttle time field in the response. The downside is that the client
metrics will be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and
`log.cleaner.min.cleanable.ratio`.

Ismael
On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should
   have the total time your request was throttled irrespective of the
   quotas that caused it? Limiting it to the byte rate quota doesn't
   make sense, but also I don't think we want to end up adding new
   fields in the response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about io
   threads. Once we introduce these quotas people set them and expect
   them to be enforced (and if they aren't it may cause an outage). As
   a result they are a bit more sensitive than normal configs, I think.
   The current thread pools seem like something of an implementation
   detail and not the level the user-facing quotas should be involved
   with. I think it might be better to make this a general request-time
   throttle with no mention in the naming about I/O threads and simply
   acknowledge the current limitation (which we may someday fix) in the
   docs that this covers only the time after the request is read off
   the network.
   3. As such I think the right interface to the user would be
   something like percent_request_time in {0,...,100} or
   request_time_ratio in {0.0,...,1.0} (I think "ratio" is the
   terminology we used if the scale is between 0 and 1 in the other
   metrics, right?)

-Jay
On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and
request time quotas.
Dong: I hadn't added much detail to the metrics and sensors since they
are going to be very similar to the existing metrics and sensors. To
avoid confusion, I have now added more detail. All metrics are in the
group "quotaType" and all sensors have names starting with "quotaType"
(where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/
*IOThread*). So there will be no reuse of existing metrics/sensors. The
new ones for request processing time based throttling will be
completely independent of existing metrics/sensors, but will be
consistent in format.
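As a rough illustration of the naming scheme using the common Metrics
API (the sensor and metric names below are examples, not the final
names):

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;

    public class QuotaSensorNamingSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            // Sensor name starts with the quota type (here IOThread) and
            // the metric group is the quota type as well.
            Sensor throttleTime = metrics.sensor("IOThreadThrottleTime-clientA");
            throttleTime.add(metrics.metricName("throttle-time", "IOThread",
                    "Average throttle time in ms for client clientA"), new Avg());
            throttleTime.record(12.0); // record 12 ms of throttling
            metrics.close();
        }
    }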
The existing throttle_time_ms field in produce/fetch responses will not
be impacted by this KIP. That will continue to return byte-rate based
throttling times. In addition, a new field request_throttle_time_ms
will be added to return request quota based throttling times. These
will be exposed as new metrics on the client-side.
Since all metrics and sensors are different for each type of quota, I
believe there are already sufficient metrics to monitor throttling on
both the client and broker side for each type of throttling.

Regards,

Rajini
On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to
quota a user's traffic here. LGTM overall. I have some questions
regarding sensors.

- Can you be more specific in the KIP about what sensors will be added?
For example, it will be useful to specify the name and attributes of
these new sensors.

- We currently have throttle-time and queue-size for the byte-rate
based quota. Are you going to have separate throttle-time and
queue-size for requests throttled by the io_thread_unit-based quota, or
will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse
contain time due to the io_thread_unit-based quota?

- Currently the kafka server doesn't provide any log or metrics that
tell whether any given clientId (or user) is throttled. This is not too
bad because we can still check the client-side byte-rate metric to
validate whether a given client is throttled. But with this
io_thread_unit, there will be no way to validate whether a given client
is slow because it has exceeded its io_thread_unit limit. It is
necessary for the user to be able to know this information to figure
out whether they have reached their quota limit. How about we add a
log4j log on the server side to periodically print the (client_id,
byte-rate-throttle-time, io-thread-unit-throttle-time) so that the
kafka administrator can identify those users that have reached their
limit and act accordingly?
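To sketch what I mean (the map and values below are made up; the real
numbers would come from the broker's quota sensors):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class ThrottleLogSketch {
        // Hypothetical per-client throttle times in ms:
        // {byte-rate, io-thread-unit}.
        static final Map<String, long[]> throttleTimesMs = new ConcurrentHashMap<>();

        public static void main(String[] args) {
            throttleTimesMs.put("clientA", new long[]{120L, 45L});

            ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                for (Map.Entry<String, long[]> e : throttleTimesMs.entrySet()) {
                    long[] t = e.getValue();
                    // The broker would emit this via log4j rather than stdout.
                    System.out.printf("client_id=%s byte-rate-throttle-time=%dms "
                        + "io-thread-unit-throttle-time=%dms%n",
                        e.getKey(), t[0], t[1]);
                }
            }, 0, 1, TimeUnit.MINUTES);
        }
    }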
Thanks,
Dong
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > On Wed, Feb 22,
> > 2017
> > > > at
> > > > > > 4:46
> > > > > > > > PM,
> > > > > > > > > > > > > Guozhang
> > > > > > > > > > > > > > > > Wang <
> > > > > > > > > > > > > > > > > > > > >> > >> > >> wangguoz@gmail.com>
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > >
Made a pass over the doc, overall LGTM except a minor comment on the
throttling implementation:

Stated as "Request processing time throttling will be applied on top if
necessary.", I thought that it meant the request processing time throttling
is applied first, but reading on I found it actually meant to apply
produce/fetch byte rate throttling first.

Also the last sentence, "The remaining delay if any is applied to the
response.", is a bit confusing to me. Maybe reword it a bit?
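One possible reading of those two sentences, sketched below (this is an
assumption about the intended semantics, not wording taken from the KIP):
the byte rate delay is applied first, and the request time quota only adds
whatever delay is still outstanding beyond it.

    // Assumed semantics, for illustration only: produce/fetch byte rate
    // throttling is applied first; request processing time throttling then
    // contributes only the remaining delay, applied to the response.
    def remainingDelayMs(byteRateDelayMs: Long, requestTimeDelayMs: Long): Long =
      math.max(0L, requestTimeDelayMs - byteRateDelayMs)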
Guozhang

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:
Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun
On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun/Roger,

Thank you for the feedback.
1. I have updated the KIP to use absolute units instead of percentage. The
property is called *io_thread_units* to align with the thread count property
*num.io.threads*. When we implement network thread utilization quotas, we
can add another property *network_thread_units*.
2. ControlledShutdown is already listed under the exempt requests. Jun, did
you mean a different request that needs to be added? The four requests
currently exempt in the KIP are StopReplica, ControlledShutdown,
LeaderAndIsr and UpdateMetadata. These are controlled using the
ClusterAction ACL, so it is easy to exclude them and only throttle if
unauthorized. I wasn't sure if there are other requests used only for
inter-broker communication that needed to be excluded.
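A sketch of that exemption rule (illustrative only; the names below are
mine, not the broker's actual code):

    // Requests used for inter-broker communication are exempt from the
    // quota, but only when the caller is authorized for ClusterAction; the
    // same request types from an unauthorized client are still throttled.
    val exemptRequests =
      Set("StopReplica", "ControlledShutdown", "LeaderAndIsr", "UpdateMetadata")

    def isThrottleExempt(requestName: String, authorizedForClusterAction: Boolean): Boolean =
      exemptRequests.contains(requestName) && authorizedForClusterAction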
3. I was thinking the smallest change would be to replace all references to
*requestChannel.sendResponse()* with a local method
*sendResponseMaybeThrottle()* that does the throttling, if any, and then
sends the response. If we throttle first in *KafkaApis.handle()*, the time
spent within the method handling the request will not be recorded or used in
throttling. We can look into this again when the PR is ready for review.
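A minimal sketch of what such a wrapper could look like (hypothetical types
and names, only to make the idea concrete; the real KafkaApis code will
differ):

    // Hypothetical stand-ins for the broker's request/response types.
    trait Response
    case class Request(user: String, clientId: String, dequeueTimeNanos: Long)

    class QuotaManagerSketch {
      // Record the observed request handler time and return any delay (ms)
      // needed to bring this (user, client-id) back under its quota.
      def recordAndMaybeThrottle(user: String, clientId: String, timeNanos: Long): Int = 0
    }

    class ApisSketch(quotas: QuotaManagerSketch,
                     send: Response => Unit,
                     delaySend: (Response, Int) => Unit) {
      // Called wherever requestChannel.sendResponse() is called today, so
      // the time already spent handling the request is part of the measurement.
      def sendResponseMaybeThrottle(request: Request, response: Response): Unit = {
        val usedNanos = System.nanoTime() - request.dequeueTimeNanos
        val throttleMs =
          quotas.recordAndMaybeThrottle(request.user, request.clientId, usedNanos)
        if (throttleMs > 0) delaySend(response, throttleMs) // e.g. park in purgatory
        else send(response)
      }
    }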
Regards,

Rajini

On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:
Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1
request handler unit, then it's as if I have a Kafka broker with a single
request handler thread dedicated to me. That's the most I can use, at least.
That allocation doesn't change even if an admin later increases the size of
the request thread pool on the broker. It's similar to the CPU abstraction
that VMs and containers get from hypervisors or OS schedulers.
While different client access patterns can use wildly different amounts of
request thread resources per request, a given application will generally
have a stable access pattern and can figure out empirically how many
"request thread units" it needs to meet its throughput/latency goals.
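Reduced to arithmetic (illustrative only), the point is that an absolute
allocation stays fixed while the client's share of the whole broker shrinks
as the pool grows:

    // With 1 unit on an 8-thread broker a client holds 1/8 of the handler
    // capacity; after the pool grows to 16 threads it holds 1/16, but its
    // absolute capacity (one thread's worth) is unchanged.
    def fractionOfBroker(allocatedUnits: Double, numIoThreads: Int): Double =
      allocatedUnits / numIoThreads
    // fractionOfBroker(1.0, 8)  == 0.125
    // fractionOfBroker(1.0, 16) == 0.0625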
Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:
Hi, Rajini,

Thanks for the updated KIP. A few more comments.
1. A concern with request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota based
on an absolute request thread unit is better.
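The same concern as arithmetic (illustrative only): a percentage quota is
relative to the pool size, so growing the pool silently grows every user's
absolute capacity.

    // A 10% quota equals 0.8 threads' worth of time on an 8-thread broker,
    // but 1.6 threads' worth once num.io.threads is doubled to 16.
    def absoluteCapacityThreads(quotaPercent: Double, numIoThreads: Int): Double =
      quotaPercent / 100.0 * numIoThreads
    // absoluteCapacityThreads(10, 8)  == 0.8
    // absoluteCapacityThreads(10, 16) == 1.6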
2. ControlledShutdownRequest is also an inter-broker request and needs to be
excluded from throttling.
3. Implementation wise, I am wondering if it's simpler to apply the request
time throttling first in KafkaApis.handle(). Otherwise, we will need to add
the throttling logic in each type of request.
Thanks,

Jun

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler
utilization. At the moment, it uses percentage, but I am happy to change to
a fraction (out of 1 instead of 100) if required. I have added the examples
from this discussion to the KIP. Also added a "Future Work" section to
address network thread utilization. The configuration is named
"request_time_percent" with the expectation that it can also be used as the
limit for network thread utilization when that is implemented, so that users
have to set only one config for the two and not have to worry about the
internal distribution of the work between the two thread pools in Kafka.
Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is
exactly what people have said. I will just expand on that a bit. Consider
the following case. The producer sends a produce request with a 10MB message
but compressed to 100KB with gzip. The decompression of the message on the
broker could take 10-15 seconds, during which time a request handler thread
is completely blocked. In this case, neither the byte-in quota nor the
request rate quota may be effective in protecting the broker. Consider
another case. A consumer group starts with 10 instances and later on
switches to 20 instances. The request rate will likely double, but the
actual load on the broker may not double since each fetch request only
contains half of the partitions. A request rate quota may not be easy to
configure in this case.

What we really want is to be able to prevent a client from using too much of
the server-side resources. In this particular KIP, this resource is the
capacity of the request handler threads. I agree that it may not be
intuitive for the users to determine how to set the right limit. However,
this is not completely new and has been done in the container world already.
For example, Linux cgroups
(https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
have the concept of cpu.cfs_quota_us, which specifies the total amount of
time in microseconds for which all tasks in a cgroup can run during a
one-second period. We can potentially model the request handler threads in a
similar way. For example, each request handler thread can be 1 request
handler unit and the admin can configure a limit on how many units (say
0.01) a client can have.
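Made concrete (illustrative only; the unit semantics here are an assumption
modelled on cpu.cfs_quota_us against a one-second cfs_period_us):

    // 1.0 unit == one request handler thread fully busy for the whole window.
    // A quota of 0.01 units over a 1-second window therefore allows 10 ms of
    // request handler thread time per second, regardless of the pool size.
    def allowedNanosPerWindow(quotaUnits: Double, windowNanos: Long): Long =
      (quotaUnits * windowNanos).toLong
    // allowedNanosPerWindow(0.01, 1000000000L) == 10000000L  (i.e. 10 ms)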
Regarding not throttling the internal broker-to-broker requests: we could do
that. Alternatively, we could just let the admin configure a high limit for
the kafka user (it may not be able to do that easily based on clientId
though).

Ideally we want to be able to protect the utilization of the network thread
pool too. The difficulty is mostly what Rajini said: (1) The mechanism for
throttling the requests is through Purgatory and we will have to think
through how to integrate that into the network layer. (2) In the network
layer, currently we know the user, but not the clientId of the request. So,
it's a bit tricky to throttle based on clientId there. Plus, the byteOut
quota can already protect the network thread utilization for fetch requests.
So, if we can't figure out this part right now, just focusing on the request
handling threads for this KIP is still a useful feature.
Thanks,

Jun

On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc.; these are
throttled only if authorization fails (so they can't be used for DoS attacks
in a secure cluster, but inter-broker requests complete without delays).

I will wait another day to see if there is any objection to quotas based on
request processing time (as opposed to request rate) and if there are no
objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request
handler threads (that made calculation easy). I think the suggestion is to
include the time spent in
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > network
> > > > > > > > > > > > > > > > > > > > >> > >> threads as
> > > > > > > > > > > > > > > > > > > > >> > >> > >> well
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > since
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > that
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > may
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > significant.
> > > > > > > > As
> > > > > > > > > > Jay
> > > > > > > > > > > > > > pointed
> > > > > > > > > > > > > > > > out,
> > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> more
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > complicated
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> calculate
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > total
> > > > > > > > available
> > > > > > > > > > CPU
> > > > > > > > > > > > time
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > convert
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> a
> > > > > > > > > > > > > > > > > > > > >> > >> ratio
> > > > > > > > > > > > > > > > > > > > >> > >> > >> when
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > there
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > *m*
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > I/O
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > threads
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > and
> > > > *n*
> > > > > > > > network
> > > > > > > > > > > > threads.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> >
> > > > ThreadMXBean#getThreadCPUTime(
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > )
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > may
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > give
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > us
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> what
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> we
> > > > want,
> > > > > > but
> > > > > > > > it
> > > > > > > > > > can
> > > > > > > > > > > be
> > > > > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > expensive
> > > > > > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > >> > some
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > platforms.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > As
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Becket
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > Guozhang
> > > > > > > have
> > > > > > > > > > > pointed
> > > > > > > > > > > > > out,
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > do
> > > > > > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > > > > >> > several
> > > > > > > > > > > > > > > > > > > > >> > >> > time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > measurements
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> already
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> for
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > generating
> > > > > > > > > metrics
> > > > > > > > > > > > that
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > > > > use,
> > > > > > > > > > > > > > > > > > > > >> > though
> > > > > > > > > > > > > > > > > > > > >> > >> we
> > > > > > > > > > > > > > > > > > > > >> > >> > >> might
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > want
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > switch
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > nanoTime()
> > > > > > > > > instead
> > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> currentTimeMillis()
> > > > > > > > > > > > > > > > > > > > >> > >> since
> > > > > > > > > > > > > > > > > > > > >> > >> > >> some
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > of
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > values
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > small
> > > > > > > requests
> > > > > > > > > may
> > > > > > > > > > > be
> > > > > > > > > > > > <
> > > > > > > > > > > > > > 1ms.
> > > > > > > > > > > > > > > > But
> > > > > > > > > > > > > > > > > > > > rather
> > > > > > > > > > > > > > > > > > > > >> > than
> > > > > > > > > > > > > > > > > > > > >> > >> add
> > > > > > > > > > > > > > > > > > > > >> > >> > >> up
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > spent
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > I/O
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > thread
> > > > > and
> > > > > > > > > network
> > > > > > > > > > > > > thread,
> > > > > > > > > > > > > > > > > > wouldn't
> > > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > >> be
> > > > > > > > > > > > > > > > > > > > >> > >> better
> > > > > > > > > > > > > > > > > > > > >> > >> > >> to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > convert
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > spent
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> on
> > > > each
> > > > > > > thread
> > > > > > > > > > into
> > > > > > > > > > > a
> > > > > > > > > > > > > > > separate
> > > > > > > > > > > > > > > > > > > ratio?
> > > > > > > > > > > > > > > > > > > > >> UserA
> > > > > > > > > > > > > > > > > > > > >> > >> has
> > > > > > > > > > > > > > > > > > > > >> > >> > a
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > request
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > quota
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > 5%.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> Can
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> we
> > > > take
> > > > > > that
> > > > > > > > to
> > > > > > > > > > mean
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > UserA
> > > > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > > > > >> 5%
> > > > > > > > > > > > > > > > > > > > >> > of
> > > > > > > > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > on
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > network
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > threads
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > and
> > > 5%
> > > > > of
> > > > > > > the
> > > > > > > > > time
> > > > > > > > > > > on
> > > > > > > > > > > > > I/O
> > > > > > > > > > > > > > > > > threads?
> > > > > > > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > > > > > >> > either
> > > > > > > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > exceeded,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> response
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > throttled
> > > > > > -
> > > > > > > it
> > > > > > > > > > would
> > > > > > > > > > > > > mean
> > > > > > > > > > > > > > > > > > > maintaining
> > > > > > > > > > > > > > > > > > > > >> two
> > > > > > > > > > > > > > > > > > > > >> > >> sets
> > > > > > > > > > > > > > > > > > > > >> > >> > of
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > metrics
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > for
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > two
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > durations,
> > > > > > > but
> > > > > > > > > > would
> > > > > > > > > > > > > > result
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > > >> > >> meaningful
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > ratios.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > We
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > could
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > define
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> two
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > quota
> > > > > > limits
> > > > > > > > > > (UserA
> > > > > > > > > > > > has
> > > > > > > > > > > > > 5%
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > > > > >> > threads
> > > > > > > > > > > > > > > > > > > > >> > >> > and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> 10%
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > network
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > threads),
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > but
> > > > that
> > > > > > > seems
> > > > > > > > > > > > > unnecessary
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > harder
> > > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> > >> explain
> > > > > > > > > > > > > > > > > > > > >> > >> > >> to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > users.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > Back
> > > > to
> > > > > > why
> > > > > > > > and
> > > > > > > > > > how
> > > > > > > > > > > > > quotas
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > > applied
> > > > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > > > >> > >> > network
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > thread
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> > utilization:
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> a)
> > > In
> > > > > the
> > > > > > > case
> > > > > > > > > of
> > > > > > > > > > > > fetch,
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > > > > >> spent in
> > > > > > > > > > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > network
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > thread
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > may
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > significant
> > > > > > > > and
> > > > > > > > > I
> > > > > > > > > > > can
> > > > > > > > > > > > > see
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> > include
> > > > > > > > > > > > > > > > > > > > >> > >> > >> this.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > Are
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > there
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > requests
> > > > > > > where
> > > > > > > > > the
> > > > > > > > > > > > > network
> > > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > > > >> > >> utilization is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > significant?
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > In
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > case
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> of
> > > > > fetch,
> > > > > > > > > request
> > > > > > > > > > > > > handler
> > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > > > >> > utilization
> > > > > > > > > > > > > > > > > > > > >> > >> > would
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > throttle
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > high
> > > > > > request
> > > > > > > > > rate,
> > > > > > > > > > > low
> > > > > > > > > > > > > > data
> > > > > > > > > > > > > > > > > volume
> > > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > >> > fetch
> > > > > > > > > > > > > > > > > > > > >> > >> > byte
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > rate
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > quota
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > will
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > throttle
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > clients
> > > > > > with
> > > > > > > > > high
> > > > > > > > > > > data
> > > > > > > > > > > > > > > volume.
> > > > > > > > > > > > > > > > > > > Network
> > > > > > > > > > > > > > > > > > > > >> > thread
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > utilization
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > perhaps
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > proportional
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > data
> > > > > > > > > > > > > > > > > volume. I
> > > > > > > > > > > > > > > > > > > am
> > > > > > > > > > > > > > > > > > > > >> > >> wondering
> > > > > > > > > > > > > > > > > > > > >> > >> > >> if we
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > even
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > need
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > throttle
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > based
> > > > on
> > > > > > > > network
> > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > utilization
> > > > > > > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > > > > >> > >> whether
> > > > > > > > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > data
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > volume
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > quota
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > covers
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > this
> > > > > case.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> b)
> > > At
> > > > > the
> > > > > > > > > moment,
> > > > > > > > > > we
> > > > > > > > > > > > > > record
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > check
> > > > > > > > > > > > > > > > > > > > >> for
> > > > > > > > > > > > > > > > > > > > >> > >> quota
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > violation
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > same
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > time.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > If a
> > > > > quota
> > > > > > > is
> > > > > > > > > > > > violated,
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > response
> > > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> > >> delayed.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > Using
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > Jay'e
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > example
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > disk
> > > > > reads
> > > > > > > for
> > > > > > > > > > > fetches
> > > > > > > > > > > > > > > > happening
> > > > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> network
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > thread,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > We
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > can't
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > record
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > delay
> > > > a
> > > > > > > > response
> > > > > > > > > > > after
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > disk
> > > > > > > > > > > > > > > > > > > reads.
> > > > > > > > > > > > > > > > > > > > >> We
> > > > > > > > > > > > > > > > > > > > >> > >> could
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > record
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > spent
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > the
> > > > > > network
> > > > > > > > > thread
> > > > > > > > > > > > when
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > response
> > > > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> > >> complete
> > > > > > > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > introduce
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > a
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > delay
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> for
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > handling a
> > > > > > > > > > > subsequent
> > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > > > > (separate
> > > > > > > > > > > > > > > > > > > > >> out
> > > > > > > > > > > > > > > > > > > > >> > >> > >> recording
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > quota
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > violation
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > handling
> > > > > > in
> > > > > > > > the
> > > > > > > > > > case
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > network
> > > > > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > > > >> > >> > overload).
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > Does
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > make
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > sense?
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > Regards,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > Rajini
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> On
> > > > Tue,
> > > > > > Feb
> > > > > > > > 21,
> > > > > > > > > > 2017
> > > > > > > > > > > > at
> > > > > > > > > > > > > > 2:58
> > > > > > > > > > > > > > > > AM,
> > > > > > > > > > > > > > > > > > > > Becket
> > > > > > > > > > > > > > > > > > > > >> > Qin <
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > becket.qin@gmail.com>
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > wrote:
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > Hey
> > > > > Jay,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > Yeah,
> > > > > I
> > > > > > > > agree
> > > > > > > > > > that
> > > > > > > > > > > > > > > enforcing
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > > > > >> time
> > > > > > > > > > > > > > > > > > > > >> > >> is a
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > little
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > tricky. I
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > am
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > thinking
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > that
> > > > > > maybe
> > > > > > > > we
> > > > > > > > > > can
> > > > > > > > > > > > use
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > existing
> > > > > > > > > > > > > > > > > > > > >> > request
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > statistics.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > They
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > are
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > already
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > very
> > > > > > > > detailed
> > > > > > > > > so
> > > > > > > > > > > we
> > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > > > > see
> > > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > approximate
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > CPU
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > from
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > it,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > e.g.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > something
> > > > > > > > like
> > > > > > > > > > > > > > > (total_time -
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > >
> > > > > request/response_queue_time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > -
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > > remote_time).
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > I
> > > > > agree
> > > > > > > with
> > > > > > > > > > > > Guozhang
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > > > >> user is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> throttled
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > it
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > likely
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > need
> > > > > to
> > > > > > > see
> > > > > > > > if
> > > > > > > > > > > > > anything
> > > > > > > > > > > > > > > has
> > > > > > > > > > > > > > > > > went
> > > > > > > > > > > > > > > > > > > > wrong
> > > > > > > > > > > > > > > > > > > > >> > >> first,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > if
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > users
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > are
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > well
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > behaving
> > > > > > > and
> > > > > > > > > > just
> > > > > > > > > > > > need
> > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > > > >> resources, we
> > > > > > > > > > > > > > > > > > > > >> > >> will
> > > > > > > > > > > > > > > > > > > > >> > >> > >> have
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > bump
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > up
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > quota
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > for
> > > > > > them.
> > > > > > > It
> > > > > > > > > is
> > > > > > > > > > > true
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > > > >> pre-allocating
> > > > > > > > > > > > > > > > > > > > >> > >> CPU
> > > > > > > > > > > > > > > > > > > > >> > >> > >> time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > precisely
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > users
> > > > > is
> > > > > > > > > > > difficult.
> > > > > > > > > > > > So
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > practice
> > > > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > > > >> > would
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > probably
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > more
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > like
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> first
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > set
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > a
> > > > > > relative
> > > > > > > > > high
> > > > > > > > > > > > > > protective
> > > > > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > > > > >> quota
> > > > > > > > > > > > > > > > > > > > >> > >> for
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > everyone
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > increase
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> that
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > for
> > > > > some
> > > > > > > > > > > individual
> > > > > > > > > > > > > > > clients
> > > > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > > > demand.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > Thanks,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > Jiangjie
> > > > > > > > > > (Becket)
> > > > > > > > > > > > Qin
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > On
> > > > > Mon,
> > > > > > > Feb
> > > > > > > > > 20,
> > > > > > > > > > > 2017
> > > > > > > > > > > > > at
> > > > > > > > > > > > > > > 5:48
> > > > > > > > > > > > > > > > > PM,
> > > > > > > > > > > > > > > > > > > > >> Guozhang
> > > > > > > > > > > > > > > > > > > > >> > >> > Wang <
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > wangguoz@gmail.com
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > wrote:
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > This
> > > > > > is
> > > > > > > a
> > > > > > > > > > great
> > > > > > > > > > > > > > > proposal,
> > > > > > > > > > > > > > > > > glad
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> see
> > > > > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > happening.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > I
> > > > am
> > > > > > > > > inclined
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > > > > >> throttling, or
> > > > > > > > > > > > > > > > > > > > >> > >> more
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > specifically
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > processing
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > time
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > ratio
> > > > > > > > > instead
> > > > > > > > > > of
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > > > > > > >> > >> throttling
> > > > > > > > > > > > > > > > > > > > >> > >> > >> as
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > well.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > Becket
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > has
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > well
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > summed
> > > > > > > my
> > > > > > > > > > > > rationales
> > > > > > > > > > > > > > > > above,
> > > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > > > > > > >> > >> thing to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> add
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > here
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > that
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > former
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > has
> > > > > a
> > > > > > > good
> > > > > > > > > > > support
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > both
> > > > > > > > > > > > > > > > > > > > >> "protecting
> > > > > > > > > > > > > > > > > > > > >> > >> > >> against
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > rogue
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients"
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > as
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> well
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> as
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > > "utilizing a
> > > > > > > > > > > > cluster
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > > >> multi-tenancy
> > > > > > > > > > > > > > > > > > > > >> > >> > usage":
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > when
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > thinking
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> how
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > explain
> > > > > > > > this
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > end
> > > > > > > > > > > > > > > > > > users, I
> > > > > > > > > > > > > > > > > > > > >> find
> > > > > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > > > > >> > >> > >> actually
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > more
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > natural
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > than
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > request
> > > > > > > > rate
> > > > > > > > > > > since
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > mentioned
> > > > > > > > > > > > > > > > > > > > >> above,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> different
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > will
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > have
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > quite
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > different
> > > > > > > > > > > "cost",
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > Kafka
> > > > > > > > > > > > > > > > > > > today
> > > > > > > > > > > > > > > > > > > > >> > already
> > > > > > > > > > > > > > > > > > > > >> > >> > have
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > various
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > request
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> types
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > (produce,
> > > > > > > > > > fetch,
> > > > > > > > > > > > > > admin,
> > > > > > > > > > > > > > > > > > > metadata,
> > > > > > > > > > > > > > > > > > > > >> etc),
> > > > > > > > > > > > > > > > > > > > >> > >> > >> because
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > of
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> request
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > rate
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > throttling
> > > > > > > > > may
> > > > > > > > > > > not
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > effective
> > > > > > > > > > > > > > > > > > > > >> > >> unless it
> > > > > > > > > > > > > > > > > > > > >> > >> > >> is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > set
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > very
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > conservatively.
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > Regarding
> > > > > > > > to
> > > > > > > > > > > user
> > > > > > > > > > > > > > > > reactions
> > > > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > > > >> they
> > > > > > > > > > > > > > > > > > > > >> > are
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > throttled,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > I
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > think
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > it
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > > differ
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > > > case-by-case,
> > > > > > > > > > > and
> > > > > > > > > > > > > need
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > > >> > discovered /
> > > > > > > > > > > > > > > > > > > > >> > >> > >> guided
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > by
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > looking
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > at
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > relative
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > metrics.
> > > > > > > > So
> > > > > > > > > in
> > > > > > > > > > > > other
> > > > > > > > > > > > > > > words
> > > > > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > > > > > > >> > >> not
> > > > > > > > > > > > > > > > > > > > >> > >> > >> expect
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > get
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> > additional
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > > information
> > > > > > > > > by
> > > > > > > > > > > > > simply
> > > > > > > > > > > > > > > > being
> > > > > > > > > > > > > > > > > > told
> > > > > > > > > > > > > > > > > > > > >> "hey,
> > > > > > > > > > > > > > > > > > > > >> > >> you
> > > > > > > > > > > > > > > > > > > > >> > >> > are
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > throttled",
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > which
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> all
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > what
> > > > > > > > > > throttling
> > > > > > > > > > > > > does;
> > > > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > > >> > take a
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > follow-up
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > step
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > and
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > see
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > "hmm,
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > > I'm
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > > throttled
> > > > > > > > > > > probably
> > > > > > > > > > > > > > > because
> > > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > > ..",
> > > > > > > > > > > > > > > > > > > > >> > which
> > > > > > > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > > > > > > >> > >> > by
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > looking
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > metric
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> >
> > >
> > > > > > values:
> > > > > > > > e.g.
> > > > > > > > > > > > whether
> > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > > > > bombarding
> > > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > > >> > >> > >> brokers
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > > [Message clipped]
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > *Todd Palino*
> > > > > > > > > > > Staff Site Reliability Engineer
> > > > > > > > > > > Data Infrastructure Streaming
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > linkedin.com/in/toddpalino
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > *Todd Palino*
> > > > > > > > > Staff Site Reliability Engineer
> > > > > > > > > Data Infrastructure Streaming
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > linkedin.com/in/toddpalino
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
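
For readers following the measurement discussion above: below is a minimal
sketch, in Java, of how handler-thread time could be recorded with
nanoTime() and turned into a pool-wide utilization percentage. It is an
illustration only, not the KIP's implementation; the class and method names
(RequestTimeRecorder, utilizationPercent) are invented for this example.

    import java.util.concurrent.atomic.AtomicLong;
    import java.util.function.Supplier;

    public class RequestTimeRecorder {
        private final AtomicLong busyNanos = new AtomicLong();

        // Wrap request handling on an I/O thread. nanoTime() rather than
        // currentTimeMillis(), because small requests can take < 1 ms.
        public <T> T timed(Supplier<T> handler) {
            long start = System.nanoTime();
            try {
                return handler.get();
            } finally {
                busyNanos.addAndGet(System.nanoTime() - start);
            }
        }

        // Utilization of the whole pool over one sampling window:
        // 100 * busy time / (window length * number of threads).
        public double utilizationPercent(long windowNanos, int numThreads) {
            return 100.0 * busyNanos.getAndSet(0)
                    / ((double) windowNanos * numThreads);
        }
    }

The same recorder could be kept per network thread and per I/O thread to
produce the two separate ratios discussed above.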

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Thanks for the updated KIP. It looks good to me now. Perhaps we can wait
a couple more days to see if there are more comments and then start the
vote?

Jun

On Thu, Mar 16, 2017 at 6:35 AM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Jun,
>
> 50. Yes, that makes sense. I have updated the KIP.
>
> Thank you,
>
> Rajini
>
> On Mon, Mar 13, 2017 at 7:35 PM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Rajini,
> >
> > Thanks for the updated KIP. Looks good. Just one more thing.
> >
> > 50. "Two new metrics request-throttle-time-max and
> > request-throttle-time-min
> >  will be added to reflect total request processing time based throttling
> > for all request types including produce/fetch." The most important
> clients
> > are producer and consumer, which already have the
> > produce/fetch-throttle-time-min/max
> > metrics. Should we just accumulate the throttled time for other requests
> > into these two existing metrics, instead of introducing new ones? We can
> > probably add a similar metric for the admin client later on.
> >
> > Jun
> >
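
As a hedged sketch for readers of point 50 above: this is one way such
throttle-time metrics could be registered with Kafka's common metrics
library (org.apache.kafka.common.metrics). The sensor and group names below
are illustrative assumptions, not necessarily the ones the KIP settles on.

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Max;
    import org.apache.kafka.common.metrics.stats.Min;

    public class ThrottleTimeMetrics {
        private final Sensor throttleTimeSensor;

        public ThrottleTimeMetrics(Metrics metrics) {
            throttleTimeSensor = metrics.sensor("request-throttle-time");
            throttleTimeSensor.add(
                metrics.metricName("request-throttle-time-max",
                                   "request-metrics"),
                new Max());
            throttleTimeSensor.add(
                metrics.metricName("request-throttle-time-min",
                                   "request-metrics"),
                new Min());
        }

        // Record the throttle time (in ms) applied to one request;
        // accumulating produce/fetch throttling into an existing sensor,
        // as Jun suggests, would just mean recording on that sensor here.
        public void record(double throttleTimeMs) {
            throttleTimeSensor.record(throttleTimeMs);
        }
    }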
> >
> > On Thu, Mar 9, 2017 at 2:24 PM, Rajini Sivaram <ra...@gmail.com>
> > wrote:
> >
> > > Jun,
> > >
> > > 40. Yes you are right, a single value tracking the total exempt time is
> > > sufficient. Have updated the KIP.
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> > > On Thu, Mar 9, 2017 at 9:42 PM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > The updated KIP looks good. Just one more comment.
> > > >
> > > > 40. "An additional metric exempt-request-time will also be added for
> > each
> > > > quota entity for the quota type Request." Should that metric be added
> > for
> > > > each entity type (e.g., user, client-id, etc)? It seems that value is
> > > > independent of entity types.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Thu, Mar 9, 2017 at 12:07 PM, Rajini Sivaram <
> > rajinisivaram@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Hi Jun,
> > > > >
> > > > > Thank you for reviewing the KIP again.
> > > > >
> > > > > 30. That is a good idea. In fact, it is one of the advantages of
> > > > measuring
> > > > > overall utilization rather than separate values for network and I/O
> > > > threads
> > > > > as I had intended initially. Have updated the KIP, thanks.
> > > > >
> > > > > 31. Added exempt-request-time metric.
> > > > >
> > > > > 32. I had thought of using quota.window.size.seconds *
> > > > > quota.window.num initially, but felt that would be too big. Even the
> > > > > default of 11 seconds is a rather long time to be throttled. With a
> > > > > cap of quota.window.size.seconds, if the recorded time was very high,
> > > > > each subsequent request over the total interval of the samples will
> > > > > still be throttled for up to quota.window.size.seconds. So capping at
> > > > > quota.window.size.seconds bounds the throttle time for an individual
> > > > > request, avoiding timeouts where possible, while still throttling
> > > > > over a period of time (see the sketch after this message).
> > > > >
> > > > > 33. Updated to use request_percentage.
> > > > >
> > > > >
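
A minimal sketch of the capping behaviour in point 32 above: however far a
client has overshot its quota, no single response is delayed for longer
than one quota window. The overage formula here is an illustrative
assumption, not the broker's actual calculation.

    public final class ThrottleTime {
        private ThrottleTime() {}

        public static long compute(double observedPercent,
                                   double quotaPercent,
                                   long quotaWindowMs,
                                   int numWindows) {
            // Delay long enough to bring the average over the sampled
            // windows back under quota (illustrative formula)...
            double overage = (observedPercent - quotaPercent) / quotaPercent;
            long delayMs = (long) (overage * quotaWindowMs * numWindows);
            // ...but cap at one window, so an individual request is never
            // throttled for more than quota.window.size.seconds.
            return Math.max(0, Math.min(delayMs, quotaWindowMs));
        }
    }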
> > > > > On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Thanks for the updated KIP. A few more comments.
> > > > > >
> > > > > > 30. Should we just account for the time in network threads in this
> > > > > > KIP too? The issue with doing this later is that existing quotas may
> > > > > > be too small and everyone will have to adjust them before upgrading,
> > > > > > which is inconvenient. If we just do the delaying in the io threads,
> > > > > > there probably isn't too much additional work to include the network
> > > > > > thread time?
> > > > > >
> > > > > > 31. It would be useful for the new metrics to capture the
> > > > > > utilization of all those requests exempt from request throttling
> > > > > > (under sth like "exempt"). It's useful for an admin to know how much
> > > > > > time is spent there too.
> > > > > >
> > > > > > 32. "The maximum throttle time for any single request will be the
> > > > > > quota window size (one second by default)." We probably should cap
> > > > > > the delay at quota.window.size.seconds * quota.window.num?
> > > > > >
> > > > > > 33. It's unfortunate that we use . in configs and _ in ZK data
> > > > > > structures. However, for consistency, request.percentage in ZK
> > > > > > probably should be request_percentage?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <
> > > > rajinisivaram@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I have updated the KIP to use "request.percentage" quotas where
> > > > > > > the percentage is out of a total of (num.io.threads * 100). I have
> > > > > > > added the other options considered so far under "Rejected
> > > > > > > Alternatives".
> > > > > > >
> > > > > > > To address Todd's concern about per-thread quotas: even though the
> > > > > > > quotas are out of (num.io.threads * 100), clients are not locked
> > > > > > > to particular threads. Utilization is measured as the total across
> > > > > > > all the I/O threads, so a 10% quota can be used as 1% on each of
> > > > > > > 10 threads. Individual quotas can also be greater than 100% if
> > > > > > > required.
> > > > > > >
> > > > > > > Please let me know if there are any other concerns or suggestions.
> > suggestions.
> > > > > > >
> > > > > > > Thank you,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
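
To illustrate the semantics Rajini describes (assumed names, not broker
code): the quota is checked against utilization summed across all I/O
threads, out of a total capacity of num.io.threads * 100, so clients are
not pinned to threads and quotas above 100 are legal.

    import java.util.Arrays;

    public final class RequestPercentageQuota {
        // Utilization is the sum over all I/O threads, so the same quota
        // can be consumed on one thread or spread across many.
        public static boolean withinQuota(double[] perThreadBusyPercent,
                                          double quotaPercent) {
            double total = 0;
            for (double p : perThreadBusyPercent) {
                total += p;
            }
            return total <= quotaPercent;
        }

        public static void main(String[] args) {
            // 10% of one thread and 1% on each of 10 threads both total 10:
            System.out.println(withinQuota(new double[]{10.0}, 10.0)); // true
            double[] spread = new double[10];
            Arrays.fill(spread, 1.0);
            System.out.println(withinQuota(spread, 10.0));             // true
            // A quota above 100 (more than one full thread) is also legal:
            System.out.println(withinQuota(new double[]{80, 70}, 200)); // true
        }
    }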
> > > > > > > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <
> tpalino@gmail.com>
> > > > > wrote:
> > > > > > >
> > > > > > > > Rajini -
> > > > > > > >
> > > > > > > > I understand what you’re saying, but the point I’m making is
> > > > > > > > that I don’t believe we need to take it into account directly.
> > > > > > > > The CPU utilization of the network threads is directly
> > > > > > > > proportional to the number of bytes being sent. The more bytes,
> > > > > > > > the more CPU that is required for SSL (or other tasks). This is
> > > > > > > > opposed to the request handler threads, where there are a number
> > > > > > > > of factors that affect CPU utilization. This means that it’s not
> > > > > > > > necessary to separately quota network thread byte usage and CPU -
> > > > > > > > if we quota byte usage (which we already do), we have fixed the
> > > > > > > > CPU usage at a proportional amount.
> > > > > > > >
> > > > > > > > Jun -
> > > > > > > >
> > > > > > > > Thanks for the clarification there. I was thinking of the
> > > > > > > > utilization percentage as being fixed, not what the percentage
> > > > > > > > reflects. I’m not tied to either way of doing it, provided that
> > > > > > > > we do not lock clients to a single thread. For example, if I
> > > > > > > > specify that a given client can use 10% of a single thread, that
> > > > > > > > should also mean they can use 1% on 10 threads.
> > > > > > > >
> > > > > > > > -Todd
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Todd,
> > > > > > > > >
> > > > > > > > > Thanks for the feedback.
> > > > > > > > >
> > > > > > > > > I just want to clarify your second point. If the limit
> > > > > > > > > percentage is per thread and the thread counts are changed, the
> > > > > > > > > absolute processing limit for existing users hasn't changed and
> > > > > > > > > there is no need to adjust them. On the other hand, if the
> > > > > > > > > limit percentage is of total thread pool capacity and the
> > > > > > > > > thread counts are changed, the effective processing limit for a
> > > > > > > > > user will change. So, to preserve the current processing limit,
> > > > > > > > > existing user limits have to be adjusted. If there is a
> > > > > > > > > hardware change, the effective processing limit for a user will
> > > > > > > > > change in either approach and the existing limit may need to be
> > > > > > > > > adjusted. However, hardware changes are less common than thread
> > > > > > > > > pool configuration changes.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
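
To put numbers on the two interpretations, here is a small illustrative
Java sketch (hypothetical names, not KIP code), contrasting what a 10%
quota means when an admin grows the pool from 8 to 16 threads:

    // Per-thread model: 10% always means 0.1 threads' worth of
    // processing time, so existing users' absolute limits are
    // unchanged when the pool grows.
    static double perThreadCapacity(double percent) {
        return percent / 100.0;
    }

    // Pool model: 10% of 8 threads is 0.8, of 16 threads is 1.6, so
    // the effective limit doubles when the pool doubles.
    static double poolCapacity(double percent, int numThreads) {
        return percent / 100.0 * numThreads;
    }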

On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tpalino@gmail.com> wrote:

I’ve been following this one on and off, and overall it sounds good to
me.

- The SSL question is a good one. However, that type of overhead should
be proportional to the bytes rate, so I think that a bytes rate quota
would still be a suitable way to address it.

- I think it’s better to make the quota a percentage of total thread
pool capacity, and not a percentage of an individual thread. That way
you don’t have to adjust it when you adjust thread counts (tuning,
hardware changes, etc.)

-Todd

On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <becket.qin@gmail.com> wrote:

I see. Good point about SSL.

I just asked Todd to take a look.

Thanks,

Jiangjie (Becket) Qin

On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Jiangjie,

Yes, I agree that byte rate already protects the network threads
indirectly. I am not sure if byte rate fully captures the CPU overhead
in the network due to SSL. So, at the high level, we can use the
request time limit to protect CPU and use byte rate to protect storage
and network.

Also, do you think you can get Todd to comment on this KIP?

Thanks,

Jun

On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <becket.qin@gmail.com> wrote:

Hi Rajini/Jun,

The percentage based reasoning sounds good. One thing I am wondering is
that if we assume the network threads are just doing the network IO,
can we say the bytes rate quota is already sort of a network threads
quota? If we take network threads into consideration here, would that
be somewhat overlapping with the bytes rate quota?

Thanks,

Jiangjie (Becket) Qin

On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the explanation, I hadn't realized you meant percentage
of the total thread pool. If everyone is OK with Jun's suggestion, I
will update the KIP.

Thanks,

Rajini

On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Let's take your example. Let's say a user sets the limit to 50%. I am
not sure if it's better to apply the same percentage separately to the
network and io thread pools. For example, for produce requests, most of
the time will be spent in the io threads whereas for fetch requests,
most of the time will be in the network threads. So, using the same
percentage in both thread pools means one of the pools' resources will
be over-allocated.

An alternative way is to simply model the network and io thread pools
together. If you get 10 io threads and 5 network threads, you get 1500%
request processing power. A 50% limit means a total of 750% processing
power. We just add up the time a user request spent in either network
or io threads. If that total exceeds 750% (it doesn't matter whether
it's spent more in network or io threads), the request will be
throttled. This seems more general and is not sensitive to the current
implementation detail of having separate network and io thread pools.
In the future, if the threading model changes, the same concept of
quota can still be applied. For now, since it's a bit tricky to add the
delay logic in the network thread pool, we could probably just do the
delaying only in the io threads as you suggested earlier.

There is still the orthogonal question of whether a quota of 50% is out
of 100% or 100% * #total processing threads. My feeling is that the
latter is slightly better based on my explanation earlier. The way to
describe this quota to the users can be "share of elapsed request
processing time on a single CPU" (similar to top).

Thanks,

Jun
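
A rough Java sketch of the combined model described above (hypothetical
names; window and sampling details are simplified):

    // Time spent in network and I/O threads is summed and compared
    // against a limit expressed against
    // (numIoThreads + numNetworkThreads) * 100%.
    static boolean shouldThrottle(double ioThreadTimeMs,
                                  double networkThreadTimeMs,
                                  double limitPercent, double windowMs) {
        // Percent of one thread's time used over the window, summed
        // across both pools. For example, 8000 ms of thread time in a
        // 1000 ms window is 800%, which would exceed a 750% limit (a
        // 50% share of 10 io + 5 network threads).
        double usedPercent =
            (ioThreadTimeMs + networkThreadTimeMs) / windowMs * 100.0;
        return usedPercent > limitPercent;
    }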

On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Agree about the two scenarios.

But I am still not sure about a single quota covering both network
threads and I/O threads with a per-thread quota. If there are 10 I/O
threads and 5 network threads and I want to assign half the quota to
userA, the quota would be 750%. I imagine, internally, we would convert
this to 500% for I/O and 250% for network threads to allocate 50% of
each pool (see the sketch below).

A couple of scenarios:

1. Admin adds 1 extra network thread. To retain 50%, admin needs to now
   allocate 800% for each user, or increase the quota for a few users.
   To me, it feels like the admin needs to convert 50% to 800% and
   Kafka internally needs to convert 800% to (500%, 300%). Everyone
   using just 50% feels a lot simpler.

2. We decide to add some other thread to this list. Admin needs to know
   exactly how many threads form the maximum quota. And we can be
   changing this between broker versions as we add more to the list.
   Again, a single overall percent would be a lot simpler.

There were others who were unconvinced by a single percent in the
initial proposal and were happier with thread units similar to CPU
units, so I am OK with going with per-thread quotas (as units or
percent). I am just not sure it makes it easier for the admin in all
cases.

Regards,

Rajini
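
The internal conversion described above could look like this Java
sketch (hypothetical names):

    // Split a combined per-thread quota across the two pools in
    // proportion to pool size, e.g. 750% over 10 I/O + 5 network
    // threads becomes (500%, 250%).
    static double[] splitQuota(double combinedPercent, int ioThreads,
                               int networkThreads) {
        int total = ioThreads + networkThreads;
        return new double[] {
            combinedPercent * ioThreads / total,
            combinedPercent * networkThreads / total
        };
    }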

On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Consider modeling as n * 100% units. For 2), the question is what's
causing the I/O threads to be saturated. It's unlikely that all users'
utilization has increased at the same time. A more likely case is that
a few isolated users' utilization has increased. If so, after
increasing the number of threads, the admin just needs to adjust the
quota for a few isolated users, which is expected and is less work.

Consider modeling as 1 * 100% unit. For 1), all users' quotas need to
be adjusted, which is unexpected and is more work.

So, to me, the n * 100% model seems more convenient.

As for the future extension to cover network thread utilization, I was
thinking that one way is to simply model the capacity as (n + m) * 100%
units, where n and m are the number of network and i/o threads,
respectively. Then, for each user, we can just add up the utilization
in the network and the i/o threads. If we do this, we don't need a new
type of quota.

Thanks,

Jun

On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

If we use request.percentage as the percentage used in a single I/O
thread, the total percentage being allocated will be num.io.threads *
100 for I/O threads and num.network.threads * 100 for network threads.
A single quota covering the two as a percentage wouldn't quite work if
you want to allocate the same proportion in both cases. If we want to
treat threads as separate units, won't we need two quota configurations
regardless of whether we use units or percentage? Perhaps I
misunderstood your suggestion.

I think there are two cases:

1. The use case that you mentioned where an admin is adding more users
   and decides to add more I/O threads and expects to find free quota
   to allocate for new users.

2. Admin adds more I/O threads because the I/O threads are saturated
   and there are cores available to allocate, even though the number of
   users/clients hasn't changed.

If we treated I/O threads as a single unit of 100%, all user quotas
need to be reallocated for 1). If we allocated I/O threads as n units
with n*100%, all user quotas need to be reallocated for 2), otherwise
some of the new threads may just not be used. Either way it should be
easy to write a script to decrease/increase quotas by a multiple for
all users (see the sketch below).

So it really boils down to which quota unit is most intuitive in terms
of configuration. And from the discussion so far, it feels like opinion
is divided on whether quotas should be carved out of an absolute 100%
(or 1 unit) or be relative to the number of threads (n*100% or n
units).
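
The rescaling itself is a few lines in any language; a Java sketch of
just that step (hypothetical names; reading and writing the quota
configs is omitted):

    import java.util.HashMap;
    import java.util.Map;

    class QuotaRescaler {
        // Rescale per-user quotas when the thread count changes so
        // that each user keeps the same share of total capacity.
        static Map<String, Double> rescale(Map<String, Double> quotas,
                                           int oldThreads, int newThreads) {
            double factor = (double) newThreads / oldThreads;
            Map<String, Double> scaled = new HashMap<>();
            quotas.forEach((user, pct) -> scaled.put(user, pct * factor));
            return scaled;
        }
    }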

On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io> wrote:

Another way to express an absolute limit is to use request.percentage,
but treat it as the percentage used in a single request handling
thread. For now, the request handling threads can be just the io
threads. In the future, they can cover the network threads as well.
This is similar to how top reports CPU usage and may be a bit easier
for people to understand.

Thanks,

Jun

On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Jay,

2. Regarding request.unit vs request.percentage. I started with
request.percentage too. The reasoning for request.unit is the
following. Suppose that the capacity has been reached on a broker and
the admin needs to add a new user. A simple way to increase the
capacity is to increase the number of io threads, assuming there are
still enough cores. If the limit is based on percentage, the additional
capacity automatically gets distributed to existing users and we
haven't really carved out any additional resource for the new user.
Now, is it easy for a user to reason about 0.1 unit vs 10%? My feeling
is that both are hard and have to be configured empirically. Not sure
if percentage is obviously easier to reason about.

Thanks,

Jun

On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:

A couple of quick points:

1. Even though the implementation of this quota is only using io thread
time, I think we should call it something like "request-time". This
will give us flexibility to improve the implementation to cover network
threads in the future and will avoid exposing internal details like our
thread pools on the server.

2. Jun/Roger, I get what you are trying to fix but the idea of
thread/units is super unintuitive as a user-facing knob. I had to read
the KIP like eight times to understand this. I'm not sure about your
point that increasing the number of threads is a problem with a
percentage-based value; it really depends on whether the user thinks
about the "percentage of request processing time" or "thread units". If
they think "I have allocated 10% of my request processing time to user
x" then it is a bug that increasing the thread count decreases that
percent as it does in the current proposal. As a practical matter I
think the only way to actually reason about this is as a percent---I
just don't believe people are going to think, "ah, 4.3 thread units,
that is the right amount!". Instead I think they have to understand
this thread unit concept, figure out what they have set in number of
threads, compute a percent and then come up with the number of thread
units, and these will all be wrong if that thread count changes. I also
think this ties us to throttling the I/O thread pool, which may not be
where we want to end up.

3. For what it's worth I do think having a single throttle_ms field in
all the responses that combines all throttling from all quotas is
probably the simplest. There could be a use case for having separate
fields for each, but I think that is actually harder to use/monitor in
the common case, so unless someone has a use case I think just one
should be fine.

-Jay

On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

I have updated the KIP based on the discussions so far.

Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like
LeaderAndIsr etc. The simplest way to ensure that clients cannot use
these requests to bypass quotas for DoS attacks is to ensure that ACLs
prevent clients from using these requests and that unauthorized
requests are included towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a
separate throttle time, and all utilization based quotas could use the
same field (we won't add another one for network thread utilization,
for instance). But perhaps it makes sense to keep byte rate quotas
separate in produce/fetch responses to provide separate metrics? Agree
with Ismael that the name of the existing field should be changed if we
have two. Happy to switch to a single combined throttle time if that is
sufficient.

Ismael #4, #5, #6: Will update the KIP. Will use a dot separated name
for the new property. Replication quotas use dot separated names, so it
will be consistent with all properties except byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen
because the time per request can vary significantly between requests,
as mentioned in the discussion and the KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more
configuration and more metrics. Since most users would set quotas
higher than the expected usage and quotas are more of a safety net, a
single quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of
active connections since only one request per connection will be
throttled at a time.
#4 As with byte rate quotas, to use the full allocated quotas,
clients/users would need to use partitions that are distributed across
the cluster. The alternative of using cluster-wide quotas instead of
per-broker quotas would be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for the quota types
Fetch and Produce. A new one will be added for IOThread, which manages
quotas for I/O thread utilization. This will not update the Fetch or
Produce queue-size, but will have a separate metric for the queue-size.
I wasn't planning to add any additional metrics apart from the
equivalent ones for existing quotas as part of this KIP. Ratio of
byte-rate to I/O thread utilization could be slightly misleading since
it depends on the sequence of requests. But we can look into more
metrics after the KIP is implemented, if required.

I think we need to limit the maximum delay since all requests are
throttled. If a client has a quota of 0.001 units and a single request
used 50ms, we don't want to delay all requests from the client by 50
seconds, throwing the client out of all its consumer groups. The issue
arises only if a user is allocated a quota that is insufficient to
process one large request. The expectation is that the units allocated
per user will be much higher than the time taken to process one request
and the limit should seldom be applied. Agree this needs proper
documentation.

Regards,

Rajini
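
To make the 0.001-unit example concrete, here is a Java sketch of a
capped delay computation (hypothetical names; not the KIP's exact
formula):

    // Uncapped, a client with a quota of 0.001 units that spends 50 ms
    // of I/O thread time would owe 50 / 0.001 - 50, roughly 50 seconds
    // of delay; capping at the quota window keeps it from being thrown
    // out of its consumer groups.
    static long throttleDelayMs(long requestTimeMs, double quotaUnits,
                                long windowMs) {
        long uncapped = (long) (requestTimeMs / quotaUnits) - requestTimeMs;
        return Math.min(uncapped, windowMs);
    }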

On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:

@jun: i wasn't concerned about tying up a request processing thread,
but IIUC the code does still read the entire request out, which might
add up to a non-negligible amount of memory.

On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to the
window size if it is larger than the window size. I have a concern with
this:

1) This essentially means that the user is allowed to exceed their
quota over a long period of time. Can you provide an upper bound on
this deviation?

2) What is the motivation for capping the maximum delay by the window
size? I am wondering if there is a better alternative to address the
problem.

3) It means that the existing metric-related config will have a more
direct impact on the mechanism of this io-thread-unit-based quota. That
may be an important change depending on the answer to 1) above. We
probably need to document this more explicitly.

Dong

On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't, because at LinkedIn it would
be too much pressure on inGraph to expose those per-clientId metrics,
so we ended up printing them periodically to a local log. Never mind if
it is not a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for
every quota in ProduceResponse or FetchResponse. Is there any use-case
for having separate throttle-time fields for the byte-rate-quota and
the io-thread-unit-quota? You probably need to document this as an
interface change if you plan to add a new field in any request.

- I don't think IOThread belongs to quotaType. The existing quota types
(i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the
type of request that is throttled, not the quota mechanism that is
applied.

- If a request is throttled due to this io-thread-unit-based quota, is
the existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the
io-thread-unit-based quota and for users to understand its impact on
their traffic, would it be useful to have a metric that shows the
overall byte-rate per io-thread-unit? Can we also show this as a
per-clientId metric?

Thanks,
Dong

On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU
cores, but it's possible for an admin to start with fewer io threads than
cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a
user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of
returning an error immediately is that the latter has no way to prevent
the client from retrying immediately, which will make things worse. The
delaying logic is based off a delay queue. A separate expiration thread
just waits on the next request to be expired. So, it doesn't tie up a
request handler thread.

Thanks,

Jun
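
As an illustration of the delay-queue mechanism described above, a minimal
Java sketch might look like the following; the class and method names here
are hypothetical, not Kafka's actual broker classes:

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch only: a throttled response waits in a DelayQueue,
    // and a single expiration thread sends it once its delay has elapsed.
    class ThrottledRequest implements Delayed {
        private final long expireAtMs;
        final Runnable sendResponse;

        ThrottledRequest(long delayMs, Runnable sendResponse) {
            this.expireAtMs = System.currentTimeMillis() + delayMs;
            this.sendResponse = sendResponse;
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(expireAtMs - System.currentTimeMillis(),
                                TimeUnit.MILLISECONDS);
        }

        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS),
                                other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    class ThrottleExpirationThread extends Thread {
        private final DelayQueue<ThrottledRequest> delayed = new DelayQueue<>();

        // Called by a request handler thread, which returns immediately; the
        // throttled response waits in the queue, not on a handler thread.
        void throttle(ThrottledRequest request) {
            delayed.add(request);
        }

        @Override
        public void run() {
            while (!isInterrupted()) {
                try {
                    // Blocks until the head request's delay has elapsed.
                    delayed.take().sendResponse.run();
                } catch (InterruptedException e) {
                    interrupt();
                }
            }
        }
    }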

On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle
time field in the response. The downside is that the client metrics will
be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and
`log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should
   have the total time your request was throttled, irrespective of the
   quotas that caused it? Limiting it to the byte rate quota doesn't make
   sense, but I also don't think we want to end up adding new fields in
   the response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about io
   threads. Once we introduce these quotas people set them and expect them
   to be enforced (and if they aren't it may cause an outage). As a result
   they are a bit more sensitive than normal configs, I think. The current
   thread pools seem like something of an implementation detail and not
   the level the user-facing quotas should be involved with. I think it
   might be better to make this a general request-time throttle with no
   mention of I/O threads in the naming, and simply acknowledge the
   current limitation (which we may someday fix) in the docs: it covers
   only the time after the request is read off the network.
   3. As such I think the right interface to the user would be something
   like percent_request_time in {0,...,100} or request_time_ratio in
   {0.0,...,1.0} (I think "ratio" is the terminology we used in the other
   metrics if the scale is between 0 and 1, right?)

-Jay

On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and
request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are
going to be very similar to the existing metrics and sensors. To avoid
confusion, I have now added more detail. All metrics are in the group
"quotaType" and all sensors have names starting with "quotaType" (where
quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*).
So there will be no reuse of existing metrics/sensors. The new ones for
request processing time based throttling will be completely independent of
existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be
impacted by this KIP. That will continue to return byte-rate based
throttling times. In addition, a new field request_throttle_time_ms will
be added to return request quota based throttling times. These will be
exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I
believe there are already sufficient metrics to monitor throttling on both
the client and broker side for each type of throttling.

Regards,

Rajini
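
To make the naming convention above concrete, a hedged sketch of
per-quota-type sensors using the org.apache.kafka.common.metrics API; the
exact sensor and metric names in the KIP may differ from the illustrative
ones used here:

    import java.util.Collections;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;
    import org.apache.kafka.common.metrics.stats.Max;

    // Sketch only: one independent throttle-time sensor per quota type,
    // consistent in format with the existing byte-rate quota sensors.
    public class QuotaSensorsSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            String[] quotaTypes = {"Produce", "Fetch", "LeaderReplication",
                                   "FollowerReplication", "IOThread"};
            for (String quotaType : quotaTypes) {
                Sensor sensor = metrics.sensor(quotaType + "ThrottleTime-clientA");
                sensor.add(metrics.metricName("throttle-time-avg", quotaType,
                        "Average throttle time in ms",
                        Collections.singletonMap("client-id", "clientA")), new Avg());
                sensor.add(metrics.metricName("throttle-time-max", quotaType,
                        "Maximum throttle time in ms",
                        Collections.singletonMap("client-id", "clientA")), new Max());
            }
            metrics.close();
        }
    }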

On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to
quota users' traffic here. LGTM overall. I have some questions regarding
sensors.

- Can you be more specific in the KIP about which sensors will be added?
For example, it would be useful to specify the name and attributes of
these new sensors.

- We currently have throttle-time and queue-size for the byte-rate based
quota. Are you going to have separate throttle-time and queue-size for
requests throttled by the io_thread_unit-based quota, or will they share
the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain
time due to the io_thread_unit-based quota?

- Currently the Kafka server doesn't provide any log or metrics that tell
whether any given clientId (or user) is throttled. This is not too bad
because we can still check the client-side byte-rate metric to validate
whether a given client is throttled. But with this io_thread_unit quota,
there will be no way to validate whether a given client is slow because it
has exceeded its io_thread_unit limit. It is necessary for users to be
able to know this information to figure out whether they have reached
their quota limit. How about we add a log4j log on the server side to
periodically print (client_id, byte-rate-throttle-time,
io-thread-unit-throttle-time) so that a Kafka administrator can identify
those users that have reached their limit and act accordingly?

Thanks,
Dong
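
A sketch of what such a periodic log could look like on the broker side;
all names here are illustrative, and the map would have to be fed by the
quota managers:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Illustrative sketch only: periodically log per-client throttle times
    // so an administrator can see which clients have hit a quota.
    class ThrottleReporter {
        private static final Logger log = LoggerFactory.getLogger(ThrottleReporter.class);
        // clientId -> {byte-rate throttle ms, io-thread-unit throttle ms}
        final Map<String, long[]> throttleMsByClient = new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        void start() {
            scheduler.scheduleAtFixedRate(() ->
                throttleMsByClient.forEach((clientId, t) -> {
                    if (t[0] > 0 || t[1] > 0)
                        log.info("client_id={} byte-rate-throttle-time={}ms "
                                + "io-thread-unit-throttle-time={}ms",
                                clientId, t[0], t[1]);
                }), 1, 1, TimeUnit.MINUTES);
        }
    }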

On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the
throttling implementation:

Stated as "Request processing time throttling will be applied on top if
necessary." I thought that it meant the request processing time throttling
is applied first, but continuing to read I found it actually meant to
apply produce / fetch byte rate throttling first.

Also the last sentence "The remaining delay if any is applied to the
response." is a bit confusing to me. Maybe reword it a bit?

Guozhang

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The
property is called *io_thread_units* to align with the thread count
property *num.io.threads*. When we implement network thread utilization
quotas, we can add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun,
did you mean a different request that needs to be added? The four requests
currently exempt in the KIP are StopReplica, ControlledShutdown,
LeaderAndIsr and UpdateMetadata. These are controlled using the
ClusterAction ACL, so it is easy to exclude them and only throttle if
unauthorized. I wasn't sure if there are other requests used only for
inter-broker communication that needed to be excluded.

3. I was thinking the smallest change would be to replace all references
to *requestChannel.sendResponse()* with a local method
*sendResponseMaybeThrottle()* that does the throttling, if any, plus sends
the response. If we throttle first in *KafkaApis.handle()*, the time spent
within the method handling the request will not be recorded or used in
throttling. We can look into this again when the PR is ready for review.

Regards,

Rajini
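
A hypothetical sketch of the shape of that change; none of these types or
signatures are Kafka's actual ones, it only illustrates why measuring at
response time captures the handler time that throttling first in
KafkaApis.handle() would miss:

    // Illustrative sketch only; 'QuotaManager' and the method signatures
    // below are stand-ins, not the broker's real types.
    class ResponseSender {
        interface QuotaManager {
            // Records usage and returns the delay to apply, or 0.
            long recordAndGetThrottleMs(String clientId, long handlerTimeNanos);
        }

        private final QuotaManager quotas;

        ResponseSender(QuotaManager quotas) {
            this.quotas = quotas;
        }

        // Stands in for every former requestChannel.sendResponse(...) call
        // site. Because the quota check runs after the handler finishes, the
        // time spent inside the handler is included in the recorded usage.
        void sendResponseMaybeThrottle(String clientId, long requestStartNanos,
                                       Runnable sendResponse) {
            long handlerTimeNanos = System.nanoTime() - requestStartNanos;
            long delayMs = quotas.recordAndGetThrottleMs(clientId, handlerTimeNanos);
            if (delayMs > 0)
                scheduleAfterDelay(delayMs, sendResponse);
            else
                sendResponse.run();
        }

        private void scheduleAfterDelay(long delayMs, Runnable sendResponse) {
            // Delay-queue mechanics as in the earlier sketch.
        }
    }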

On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1
request handler unit, then it's as if I have a Kafka broker with a single
request handler thread dedicated to me. That's the most I can use, at
least. That allocation doesn't change even if an admin later increases the
size of the request thread pool on the broker. It's similar to the CPU
abstraction that VMs and containers get from hypervisors or OS schedulers.
While different client access patterns can use wildly different amounts of
request thread resources per request, a given application will generally
have a stable access pattern and can figure out empirically how many
"request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota
based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to
be excluded from throttling.

3. Implementation wise, I am wondering if it's simpler to apply the
request time throttling first in KafkaApis.handle(). Otherwise, we will
need to add the throttling logic in each type of request.

Thanks,

Jun
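
To make point 1 concrete, a quick worked example with assumed numbers:

    // Worked example of point 1 (all numbers assumed for illustration).
    public class PercentQuotaExample {
        public static void main(String[] args) {
            double quotaPercent = 10.0;   // the user's configured limit
            int handlerThreads = 8;
            // Absolute capacity: thread-seconds of handler time per second.
            System.out.println(handlerThreads * quotaPercent / 100);  // 0.8
            handlerThreads = 16;          // admin doubles the pool
            System.out.println(handlerThreads * quotaPercent / 100);  // 1.6
        }
    }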

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request
handler utilization. At the moment, it uses a percentage, but I am happy
to change to a fraction (out of 1 instead of 100) if required. I have
added the examples from this discussion to the KIP. Also added a "Future
Work" section to address network thread utilization. The configuration is
named "request_time_percent" with the expectation that it can also be used
as the limit for network thread utilization when that is implemented, so
that users have to set only one config for the two and not have to worry
about the internal distribution of the work between the two thread pools
in Kafka.

Regards,

Rajini
On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
> Hi, Rajini,
>
> Thanks for the proposal.
>
> The benefit of using the request processing time over the request rate
> is exactly what people have said. I will just expand that a bit.
> Consider the following case. The producer sends a produce request with
> a 10MB message but compressed to 100KB with gzip. The decompression of
> the message on the broker could take 10-15 seconds, during which time,
> a request handler thread is completely blocked. In this case, neither
> the byte-in quota nor the request rate quota may be effective in
> protecting the broker.
>
> Consider another case. A consumer group starts with 10 instances and
> later on switches to 20 instances. The request rate will likely double,
> but the actual load on the broker may not double since each fetch
> request only contains half of the partitions. Request rate quota may
> not be easy to configure in this case.
>
> What we really want is to be able to prevent a client from using too
> much of the server side resources. In this particular KIP, this
> resource is the capacity of the request handler threads. I agree that
> it may not be intuitive for the users to determine how to set the right
> limit. However, this is not completely new and has been done in the
> container world already. For example, Linux cgroup (
> https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html
> ) has the concept of cpu.cfs_quota_us, which specifies the total amount
> of time in microseconds for which all tasks in a cgroup can run during
> a one second period. We can potentially model the request handler
> threads in a similar way. For example, each request handler thread can
> be 1 request handler unit and the admin can configure a limit on how
> many units (say 0.01) a client can have.
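To make the unit model concrete, here is a minimal sketch assuming a
single fixed one-second window; the HandlerTimeQuota class and its
method are invented for illustration and are not part of the KIP or the
Kafka codebase:

    // Illustrative sketch only: each handler thread contributes one
    // "unit" per second, so a quota of 0.01 units allows 10ms of
    // handler-thread time per second.
    class HandlerTimeQuota {
        private final double quotaUnits;            // e.g. 0.01
        private long windowStartNanos = System.nanoTime();
        private long usedNanos = 0;

        HandlerTimeQuota(double quotaUnits) {
            this.quotaUnits = quotaUnits;
        }

        // Record time a request spent on a handler thread; returns true
        // if the client is now over its quota for this window.
        synchronized boolean recordAndCheck(long requestNanos) {
            long now = System.nanoTime();
            if (now - windowStartNanos >= 1_000_000_000L) {
                windowStartNanos = now;             // roll the 1s window
                usedNanos = 0;
            }
            usedNanos += requestNanos;
            return usedNanos > (long) (quotaUnits * 1_000_000_000L);
        }
    }

Kafka's quota framework samples usage over several small windows rather
than one fixed window, so this only illustrates the arithmetic of
"units".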
> Regarding not throttling the internal broker to broker requests. We
> could do that. Alternatively, we could just let the admin configure a
> high limit for the kafka user (it may not be able to do that easily
> based on clientId though).
> Ideally we want to be able to protect the utilization of the network
> thread pool too. The difficulty is mostly what Rajini said: (1) The
> mechanism for throttling the requests is through Purgatory and we will
> have to think through how to integrate that into the network layer.
> (2) In the network layer, currently we know the user, but not the
> clientId of the request. So, it's a bit tricky to throttle based on
> clientId there. Plus, the byteOut quota can already protect the network
> thread utilization for fetch requests. So, if we can't figure out this
> part right now, just focusing on the request handling threads for this
> KIP is still a useful feature.
>
> Thanks,
>
> Jun
>
> On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram
> <rajinisivaram@gmail.com> wrote:
> > Thank you all for the feedback.
> >
> > Jay: I have removed exemption for consumer heartbeat etc. Agree that
> > protecting the cluster is more important than protecting individual
> > apps. Have retained the exemption for StopReplica/LeaderAndIsr etc.;
> > these are throttled only if authorization fails (so can't be used for
> > DoS attacks in a secure cluster, but allows inter-broker requests to
> > complete without delays).
> >
> > I will wait another day to see if there is any objection to quotas
> > based on request processing time (as opposed to request rate) and if
> > there are no objections, I will revert to the original proposal with
> > some changes.
> >
> > The original proposal was only including the time used by the request
> > handler threads (that made calculation easy). I think the suggestion
> > is to include the time spent in the network threads as well since
> > that may be significant. As Jay pointed out, it is more complicated
> > to calculate the total available CPU time and convert to a ratio when
> > there are *m* I/O threads and *n* network threads.
> > ThreadMXBean#getThreadCPUTime() may give us what we want, but it can
> > be very expensive on some platforms. As Becket and Guozhang have
> > pointed out, we do have several time measurements already for
> > generating metrics that we could use, though we might want to switch
> > to nanoTime() instead of currentTimeMillis() since some of the values
> > for small requests may be < 1ms. But rather than add up the time
> > spent in the I/O thread and network thread, wouldn't it be better to
> > convert the time spent on each thread into a separate ratio? UserA
> > has a request quota of 5%. Can we take that to mean that UserA can
> > use 5% of the time on network threads and 5% of the time on I/O
> > threads? If either is exceeded, the response is throttled - it would
> > mean maintaining two sets of metrics for the two durations, but would
> > result in more meaningful ratios. We could define two quota limits
> > (UserA has 5% of request threads and 10% of network threads), but
> > that seems unnecessary and harder to explain to users.
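For reference, a self-contained example of the two measurement APIs
mentioned above; note that the java.lang.management methods are spelled
getCurrentThreadCpuTime/getThreadCpuTime:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class ThreadTimeExample {
        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            long wallStart = System.nanoTime();      // sub-ms wall clock
            long cpuStart = bean.isCurrentThreadCpuTimeSupported()
                    ? bean.getCurrentThreadCpuTime() // CPU ns, this thread
                    : -1;

            // ... the work being measured would go here ...

            long wallNanos = System.nanoTime() - wallStart;
            long cpuNanos = (cpuStart >= 0)
                    ? bean.getCurrentThreadCpuTime() - cpuStart
                    : -1;
            System.out.println("wall=" + wallNanos + "ns cpu="
                    + cpuNanos + "ns");
        }
    }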
> >
> > Back to why and how quotas are applied to network thread utilization:
> >
> > a) In the case of fetch, the time spent in the network thread may be
> > significant and I can see the need to include this. Are there other
> > requests where the network thread utilization is significant? In the
> > case of fetch, request handler thread utilization would throttle
> > clients with high request rate, low data volume, and fetch byte rate
> > quota will throttle clients with high data volume. Network thread
> > utilization is perhaps proportional to the data volume. I am
> > wondering if we even need to throttle based on network thread
> > utilization or whether the data volume quota covers this case.
> >
> > b) At the moment, we record and check for quota violation at the same
> > time. If a quota is violated, the response is delayed. Using Jay's
> > example of disk reads for fetches happening in the network thread, we
> > can't record and delay a response after the disk reads. We could
> > record the time spent on the network thread when the response is
> > complete and introduce a delay for handling a subsequent request
> > (separate out recording and quota violation handling in the case of
> > network thread overload). Does that make sense?
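A minimal sketch of the record-then-delay idea in (b); DeferredThrottle
and its methods are invented names for illustration, not Kafka code:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch: when a response completes on a network thread and the
    // client turns out to be over quota, remember a delay and apply it
    // to that client's *next* request instead of the finished response.
    class DeferredThrottle {
        private final Map<String, Long> throttleUntilNanos =
                new ConcurrentHashMap<>();

        // Called after a response completes, with the computed overage.
        void recordOverage(String clientId, long overageNanos) {
            throttleUntilNanos.put(clientId,
                    System.nanoTime() + overageNanos);
        }

        // Called before handling the client's next request: nanoseconds
        // to hold the request before processing it.
        long delayNanos(String clientId) {
            Long until = throttleUntilNanos.remove(clientId);
            return (until == null) ? 0L
                    : Math.max(0L, until - System.nanoTime());
        }
    }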
> >
> > Regards,
> >
> > Rajini
> >
> > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com>
> > wrote:
> > > Hey Jay,
> > >
> > > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > > thinking that maybe we can use the existing request statistics.
> > > They are already very detailed so we can probably see the
> > > approximate CPU time from it, e.g. something like (total_time -
> > > request/response_queue_time - remote_time).
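A rough sketch of that arithmetic; the parameter names are illustrative
and do not match Kafka's metric names exactly:

    final class RequestTimeApprox {
        // Approximate handler CPU time for a request from its recorded
        // timings, per the (total_time - queue_time - remote_time) idea
        // above. All values in nanoseconds.
        static long approximateCpuTimeNanos(long totalTime,
                                            long requestQueueTime,
                                            long responseQueueTime,
                                            long remoteTime) {
            return totalTime - requestQueueTime
                    - responseQueueTime - remoteTime;
        }
    }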
> > >
> > > I agree with Guozhang that when a user is throttled it is likely
> > > that we need to see if anything has gone wrong first, and if the
> > > users are well behaved and just need more resources, we will have
> > > to bump up the quota for them. It is true that pre-allocating CPU
> > > time quota precisely for the users is difficult. So in practice it
> > > would probably be more like first setting a relatively high
> > > protective CPU time quota for everyone and increasing it for some
> > > individual clients on demand.
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > Thanks,
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > Jiangjie
> > > > > > > > > (Becket)
> > > > > > > > > > > Qin
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> On
> > > > Mon,
> > > > > > Feb
> > > > > > > > 20,
> > > > > > > > > > 2017
> > > > > > > > > > > > at
> > > > > > > > > > > > > > 5:48
> > > > > > > > > > > > > > > > PM,
> > > > > > > > > > > > > > > > > > > >> Guozhang
> > > > > > > > > > > > > > > > > > > >> > >> > Wang <
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > wangguoz@gmail.com
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > wrote:
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > This
> > > > > is
> > > > > > a
> > > > > > > > > great
> > > > > > > > > > > > > > proposal,
> > > > > > > > > > > > > > > > glad
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >> see
> > > > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > > > >> > >> > >> > happening.
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > I
> > > am
> > > > > > > > inclined
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > > > >> throttling, or
> > > > > > > > > > > > > > > > > > > >> > >> more
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > specifically
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > processing
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> time
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > ratio
> > > > > > > > instead
> > > > > > > > > of
> > > > > > > > > > > the
> > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > > > > > >> > >> throttling
> > > > > > > > > > > > > > > > > > > >> > >> > >> as
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > well.
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > Becket
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > has
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > well
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > summed
> > > > > > my
> > > > > > > > > > > rationales
> > > > > > > > > > > > > > > above,
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > > > > > >> > >> thing to
> > > > > > > > > > > > > > > > > > > >> > >> > >> add
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > here
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > that
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > former
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > has
> > > > a
> > > > > > good
> > > > > > > > > > support
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > both
> > > > > > > > > > > > > > > > > > > >> "protecting
> > > > > > > > > > > > > > > > > > > >> > >> > >> against
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > rogue
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients"
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > as
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > well
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > > "utilizing a
> > > > > > > > > > > cluster
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > > >> multi-tenancy
> > > > > > > > > > > > > > > > > > > >> > >> > usage":
> > > > > > > > > > > > > > > > > > > >> > >> > >> > when
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > thinking
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > about
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > explain
> > > > > > > this
> > > > > > > > > to
> > > > > > > > > > > the
> > > > > > > > > > > > > end
> > > > > > > > > > > > > > > > > users, I
> > > > > > > > > > > > > > > > > > > >> find
> > > > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > > > >> > >> > >> actually
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > more
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > natural
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > than
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > request
> > > > > > > rate
> > > > > > > > > > since
> > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > mentioned
> > > > > > > > > > > > > > > > > > > >> above,
> > > > > > > > > > > > > > > > > > > >> > >> > >> different
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > will
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > have
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > quite
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > different
> > > > > > > > > > "cost",
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > Kafka
> > > > > > > > > > > > > > > > > > today
> > > > > > > > > > > > > > > > > > > >> > already
> > > > > > > > > > > > > > > > > > > >> > >> > have
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > various
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > request
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > types
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > (produce,
> > > > > > > > > fetch,
> > > > > > > > > > > > > admin,
> > > > > > > > > > > > > > > > > > metadata,
> > > > > > > > > > > > > > > > > > > >> etc),
> > > > > > > > > > > > > > > > > > > >> > >> > >> because
> > > > > > > > > > > > > > > > > > > >> > >> > >> > of
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > request
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> rate
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > throttling
> > > > > > > > may
> > > > > > > > > > not
> > > > > > > > > > > > be
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > effective
> > > > > > > > > > > > > > > > > > > >> > >> unless it
> > > > > > > > > > > > > > > > > > > >> > >> > >> is
> > > > > > > > > > > > > > > > > > > >> > >> > >> > set
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > very
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > conservatively.
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > Regarding
> > > > > > > to
> > > > > > > > > > user
> > > > > > > > > > > > > > > reactions
> > > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > > >> they
> > > > > > > > > > > > > > > > > > > >> > are
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > throttled,
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > I
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > think
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > it
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > may
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > differ
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > > > case-by-case,
> > > > > > > > > > and
> > > > > > > > > > > > need
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > >> > discovered /
> > > > > > > > > > > > > > > > > > > >> > >> > >> guided
> > > > > > > > > > > > > > > > > > > >> > >> > >> > by
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > looking
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > at
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > relative
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > metrics.
> > > > > > > So
> > > > > > > > in
> > > > > > > > > > > other
> > > > > > > > > > > > > > words
> > > > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > > > > > >> > >> not
> > > > > > > > > > > > > > > > > > > >> > >> > >> expect
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > get
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > >
> additional
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > > information
> > > > > > > > by
> > > > > > > > > > > > simply
> > > > > > > > > > > > > > > being
> > > > > > > > > > > > > > > > > told
> > > > > > > > > > > > > > > > > > > >> "hey,
> > > > > > > > > > > > > > > > > > > >> > >> you
> > > > > > > > > > > > > > > > > > > >> > >> > are
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > throttled",
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > which
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > what
> > > > > > > > > throttling
> > > > > > > > > > > > does;
> > > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > >> > take a
> > > > > > > > > > > > > > > > > > > >> > >> > >> > follow-up
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > step
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > and
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > see
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> "hmm,
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > I'm
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > > throttled
> > > > > > > > > > probably
> > > > > > > > > > > > > > because
> > > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > ..",
> > > > > > > > > > > > > > > > > > > >> > which
> > > > > > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > > > > > >> > >> > by
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > looking
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> metric
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> >
> > > > > values:
> > > > > > > e.g.
> > > > > > > > > > > whether
> > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > > > bombarding
> > > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > > >> > >> > >> brokers
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > > [Message clipped]

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Jun,

50. Yes, that makes sense. I have updated the KIP.

Thank you,

Rajini

On Mon, Mar 13, 2017 at 7:35 PM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> Thanks for the updated KIP. Looks good. Just one more thing.
>
> 50. "Two new metrics request-throttle-time-max and
> request-throttle-time-min
>  will be added to reflect total request processing time based throttling
> for all request types including produce/fetch." The most important clients
> are producer and consumer, which already have the
> produce/fetch-throttle-time-min/max
> metrics. Should we just accumulate the throttled time for other requests
> into these two existing metrics, instead of introducing new ones? We can
> probably add a similar metric for the admin client later on.
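>
> As a sketch of what that accumulation could look like on the client side
> (the sensor wiring and response accessor below are illustrative; only the
> Metrics API types are real):
>
>     import org.apache.kafka.common.metrics.Metrics;
>     import org.apache.kafka.common.metrics.Sensor;
>     import org.apache.kafka.common.metrics.stats.Avg;
>     import org.apache.kafka.common.metrics.stats.Max;
>
>     Metrics metrics = new Metrics();
>     Sensor throttle = metrics.sensor("fetch-throttle-time");
>     throttle.add(metrics.metricName("fetch-throttle-time-avg", "fetch-metrics"), new Avg());
>     throttle.add(metrics.metricName("fetch-throttle-time-max", "fetch-metrics"), new Max());
>     // Record the single combined throttle time from each response, whether
>     // the delay came from byte rate or request time throttling:
>     throttle.record(fetchResponse.throttleTimeMs());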
>
> Jun
>
>
> On Thu, Mar 9, 2017 at 2:24 PM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Jun,
> >
> > 40. Yes you are right, a single value tracking the total exempt time is
> > sufficient. Have updated the KIP.
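> >
> > Roughly (a sketch only; the metric group and the recording point are
> > assumptions):
> >
> >     // One broker-wide sensor, not tagged by user or client-id:
> >     Sensor exempt = metrics.sensor("exempt-request-time");
> >     exempt.add(metrics.metricName("exempt-request-time", "request-metrics",
> >             "Time spent handling requests exempt from throttling"), new Rate());
> >     // Recorded for every request that is exempt from throttling:
> >     exempt.record(requestTimeNanos / 1_000_000.0);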
> >
> > Thank you,
> >
> > Rajini
> >
> > On Thu, Mar 9, 2017 at 9:42 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > The updated KIP looks good. Just one more comment.
> > >
> > > 40. "An additional metric exempt-request-time will also be added for
> each
> > > quota entity for the quota type Request." Should that metric be added
> for
> > > each entity type (e.g., user, client-id, etc)? It seems that value is
> > > independent of entity types.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Mar 9, 2017 at 12:07 PM, Rajini Sivaram <
> rajinisivaram@gmail.com
> > >
> > > wrote:
> > >
> > > > Hi Jun,
> > > >
> > > > Thank you for reviewing the KIP again.
> > > >
> > > > 30. That is a good idea. In fact, it is one of the advantages of
> > > > measuring overall utilization rather than separate values for network
> > > > and I/O threads as I had intended initially. Have updated the KIP,
> > > > thanks.
> > > >
> > > > 31. Added exempt-request-time metric.
> > > >
> > > > 32. I had thought of using quota.window.size.seconds * quota.window.num
> > > > initially, but felt that would be too big. Even the default of 11 seconds
> > > > is a rather long time to be throttled. With a limit of
> > > > quota.window.size.seconds, if the recorded time was very high, subsequent
> > > > requests over the total interval of the samples will also each be
> > > > throttled for quota.window.size.seconds. So limiting at
> > > > quota.window.size.seconds caps the throttle time for an individual
> > > > request, avoiding timeouts where possible, but still throttles over a
> > > > period of time.
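> > > >
> > > > A rough sketch of that capping logic (names assumed, not the actual
> > > > implementation):
> > > >
> > > >     long throttleTimeMs(double measuredPercent, double quotaPercent,
> > > >                         long windowSizeMs, int numWindows) {
> > > >         // Delay proportional to how far over quota the client is,
> > > >         // measured over the full set of samples...
> > > >         double overage = (measuredPercent - quotaPercent) / quotaPercent;
> > > >         long delay = (long) (overage * windowSizeMs * numWindows);
> > > >         // ...but capped at a single window, so no one request is
> > > >         // delayed longer than quota.window.size.seconds:
> > > >         return Math.min(delay, windowSizeMs);
> > > >     }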
> > > >
> > > > 33. Updated to use request_percentage.
> > > >
> > > >
> > > > On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Rajini,
> > > > >
> > > > > Thanks for the updated KIP. A few more comments.
> > > > >
> > > > > 30. Should we just account for the time in network threads in this KIP
> > > > > too? The issue with doing this later is that existing quotas may be too
> > > > > small and everyone will have to adjust them before upgrading, which is
> > > > > inconvenient. If we just do the delaying in the io threads, there
> > > > > probably isn't too much additional work to include the network thread
> > > > > time?
> > > > >
> > > > > 31. It would be useful for the new metrics to capture the utilization
> > > > > of all those requests exempt from request throttling (under something
> > > > > like "exempt"). It's useful for an admin to know how much time is
> > > > > spent there too.
> > > > >
> > > > > 32. "The maximum throttle time for any single request will be the
> > quota
> > > > > window size (one second by default)." We probably should cap the
> > delay
> > > at
> > > > > quota.window.size.seconds * quota.window.num?
> > > > >
> > > > > 33. It's unfortunate that we use . in configs and _ in ZK data
> > > > > structures. However, for consistency, request.percentage in ZK probably
> > > > > should be request_percentage?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <
> > > rajinisivaram@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > I have updated the KIP to use "request.percentage" quotas where
> the
> > > > > > percentage is out of a total of (num.io.threads * 100). I have
> > added
> > > > the
> > > > > > other options considered so far under "Rejected Alternatives".
> > > > > >
> > > > > > To address Todd's concern about per-thread quotas: even though the
> > > > > > quotas are out of (num.io.threads * 100), clients are not locked into
> > > > > > threads. Utilization is measured as the total across all the I/O
> > > > > > threads, so a 10% quota can be used as 1% on each of 10 threads.
> > > > > > Individual quotas can also be greater than 100% if required.
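> > > > > >
> > > > > > As a rough sketch of that check (the names and the sampling mechanism
> > > > > > here are assumptions):
> > > > > >
> > > > > >     // Total time the client's requests spent on all I/O threads in
> > > > > >     // the sample window, as a percentage of one thread's capacity;
> > > > > >     // 10 threads fully busy for the whole window would be 1000%.
> > > > > >     double utilizationPercent(long threadTimeMs, long windowMs) {
> > > > > >         return 100.0 * threadTimeMs / windowMs;
> > > > > >     }
> > > > > >
> > > > > >     // The quota is out of (num.io.threads * 100), so a 10% quota can
> > > > > >     // be consumed as 1% on each of 10 threads, or 10% on one.
> > > > > >     boolean overQuota(double requestPercentage, long threadTimeMs,
> > > > > >                       long windowMs) {
> > > > > >         return utilizationPercent(threadTimeMs, windowMs) > requestPercentage;
> > > > > >     }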
> > > > > >
> > > > > > Please let me know if there are any other concerns or
> suggestions.
> > > > > >
> > > > > > Thank you,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tp...@gmail.com>
> > > > wrote:
> > > > > >
> > > > > > > Rajini -
> > > > > > >
> > > > > > > I understand what you’re saying, but the point I’m making is that I
> > > > > > > don’t believe we need to take it into account directly. The CPU
> > > > > > > utilization of the network threads is directly proportional to the
> > > > > > > number of bytes being sent. The more bytes, the more CPU that is
> > > > > > > required for SSL (or other tasks). This is opposed to the request
> > > > > > > handler threads, where there are a number of factors that affect CPU
> > > > > > > utilization. This means that it’s not necessary to separately quota
> > > > > > > network thread byte usage and CPU - if we quota byte usage (which we
> > > > > > > already do), we have fixed the CPU usage at a proportional amount.
> > > > > > >
> > > > > > > Jun -
> > > > > > >
> > > > > > > Thanks for the clarification there. I was thinking of the
> > > > > > > utilization percentage as being fixed, not what the percentage
> > > > > > > reflects. I’m not tied to either way of doing it, provided that we
> > > > > > > do not lock clients to a single thread. For example, if I specify
> > > > > > > that a given client can use 10% of a single thread, that should also
> > > > > > > mean they can use 1% on 10 threads.
> > threads.
> > > > > > >
> > > > > > > -Todd
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io>
> > wrote:
> > > > > > >
> > > > > > > > Hi, Todd,
> > > > > > > >
> > > > > > > > Thanks for the feedback.
> > > > > > > >
> > > > > > > > I just want to clarify your second point. If the limit percentage
> > > > > > > > is per thread and the thread counts are changed, the absolute
> > > > > > > > processing limit for existing users hasn't changed and there is no
> > > > > > > > need to adjust them. On the other hand, if the limit percentage is
> > > > > > > > of total thread pool capacity and the thread counts are changed,
> > > > > > > > the effective processing limit for a user will change. So, to
> > > > > > > > preserve the current processing limit, existing user limits have
> > > > > > > > to be adjusted. If there is a hardware change, the effective
> > > > > > > > processing limit for a user will change in either approach and the
> > > > > > > > existing limit may need to be adjusted. However, hardware changes
> > > > > > > > are less common than thread pool configuration changes.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <
> tpalino@gmail.com
> > >
> > > > > wrote:
> > > > > > > >
> > > > > > > > > I’ve been following this one on and off, and overall it
> > sounds
> > > > good
> > > > > > to
> > > > > > > > me.
> > > > > > > > >
> > > > > > > > > - The SSL question is a good one. However, that type of
> > > overhead
> > > > > > should
> > > > > > > > be
> > > > > > > > > proportional to the bytes rate, so I think that a bytes
> rate
> > > > quota
> > > > > > > would
> > > > > > > > > still be a suitable way to address it.
> > > > > > > > >
> > > > > > > > > - I think it’s better to make the quota percentage of total
> > > > thread
> > > > > > pool
> > > > > > > > > capacity, and not percentage of an individual thread. That
> > way
> > > > you
> > > > > > > don’t
> > > > > > > > > have to adjust it when you adjust thread counts (tuning,
> > > hardware
> > > > > > > > changes,
> > > > > > > > > etc.)
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > -Todd
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <
> > > becket.qin@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > I see. Good point about SSL.
> > > > > > > > > >
> > > > > > > > > > I just asked Todd to take a look.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <
> jun@confluent.io>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Jiangjie,
> > > > > > > > > > >
> > > > > > > > > > > Yes, I agree that byte rate already protects the network
> > > > > > > > > > > threads indirectly. I am not sure if byte rate fully captures
> > > > > > > > > > > the CPU overhead in the network threads due to SSL. So, at
> > > > > > > > > > > the high level, we can use the request time limit to protect
> > > > > > > > > > > CPU and use byte rate to protect storage and network.
> > > > > > > > > > >
> > > > > > > > > > > Also, do you think you can get Todd to comment on this KIP?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <
> > > > > > becket.qin@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi Rajini/Jun,
> > > > > > > > > > > >
> > > > > > > > > > > > The percentage based reasoning sounds good.
> > > > > > > > > > > > One thing I am wondering is that if we assume the network
> > > > > > > > > > > > threads are just doing network IO, can we say the byte rate
> > > > > > > > > > > > quota is already a sort of network thread quota?
> > > > > > > > > > > > If we take network threads into consideration here, would
> > > > > > > > > > > > that be somewhat overlapping with the byte rate quota?
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <
> > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Jun,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you for the explanation, I hadn't realized you
> > > > > > > > > > > > > meant percentage of the total thread pool. If everyone is
> > > > > > > > > > > > > OK with Jun's suggestion, I will update the KIP.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Rajini
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <
> > > > jun@confluent.io>
> > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Let's take your example. Let's say a user sets the
> > > > > > > > > > > > > > limit to 50%. I am not sure if it's better to apply the
> > > > > > > > > > > > > > same percentage separately to the network and io thread
> > > > > > > > > > > > > > pools. For example, for produce requests, most of the
> > > > > > > > > > > > > > time will be spent in the io threads whereas for fetch
> > > > > > > > > > > > > > requests, most of the time will be in the network
> > > > > > > > > > > > > > threads. So, using the same percentage in both thread
> > > > > > > > > > > > > > pools means one of the pools' resources will be
> > > > > > > > > > > > > > over-allocated.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > An alternative way is to simply model the network and
> > > > > > > > > > > > > > io thread pools together. If you get 10 io threads and
> > > > > > > > > > > > > > 5 network threads, you get 1500% request processing
> > > > > > > > > > > > > > power. A 50% limit means a total of 750% processing
> > > > > > > > > > > > > > power. We just add up the time a user request spent in
> > > > > > > > > > > > > > either the network or the io threads. If that total
> > > > > > > > > > > > > > exceeds 750% (it doesn't matter whether more was spent
> > > > > > > > > > > > > > in the network or the io threads), the request will be
> > > > > > > > > > > > > > throttled. This seems more general and is not sensitive
> > > > > > > > > > > > > > to the current implementation detail of having separate
> > > > > > > > > > > > > > network and io thread pools. In the future, if the
> > > > > > > > > > > > > > threading model changes, the same concept of quota can
> > > > > > > > > > > > > > still be applied. For now, since it's a bit tricky to
> > > > > > > > > > > > > > add the delay logic in the network thread pool, we
> > > > > > > > > > > > > > could probably just do the delaying only in the io
> > > > > > > > > > > > > > threads as you suggested earlier.
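> > > > > > > > > > > > > >
> > > > > > > > > > > > > > To make that concrete (a sketch with illustrative
> > > > > > > > > > > > > > numbers and variable names, not an implementation):
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >     int ioThreads = 10, networkThreads = 5;
> > > > > > > > > > > > > >     double capacityPercent = (ioThreads + networkThreads) * 100.0; // 1500%
> > > > > > > > > > > > > >     double userLimitPercent = 0.50 * capacityPercent;              // 750%
> > > > > > > > > > > > > >     // Per quota window, add up the time this user's
> > > > > > > > > > > > > >     // requests spent in either pool:
> > > > > > > > > > > > > >     double usedPercent = 100.0 * (ioTimeMs + networkTimeMs) / windowMs;
> > > > > > > > > > > > > >     // Throttled regardless of which pool the time was spent in:
> > > > > > > > > > > > > >     boolean throttled = usedPercent > userLimitPercent;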
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > There is still the orthogonal question of whether a
> > > > > > > > > > > > > > quota of 50% is out of 100% or 100% * #total processing
> > > > > > > > > > > > > > threads. My feeling is that the latter is slightly
> > > > > > > > > > > > > > better based on my explanation earlier. The way to
> > > > > > > > > > > > > > describe this quota to the users can be "share of
> > > > > > > > > > > > > > elapsed request processing time on a single CPU"
> > > > > > > > > > > > > > (similar to top).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <
> > > > > > > > > > > > rajinisivaram@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > But I am still not sure about a single quota covering
> > > > > > > > > > > > > > > both network threads and I/O threads with per-thread
> > > > > > > > > > > > > > > quotas. If there are 10 I/O threads and 5 network
> > > > > > > > > > > > > > > threads and I want to assign half the quota to userA,
> > > > > > > > > > > > > > > the quota would be 750%. I imagine, internally, we
> > > > > > > > > > > > > > > would convert this to 500% for I/O and 250% for
> > > > > > > > > > > > > > > network threads to allocate 50% of each pool.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > A couple of scenarios:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. Admin adds 1 extra network thread. To retain
> > > 50%,
> > > > > > admin
> > > > > > > > > needs
> > > > > > > > > > to
> > > > > > > > > > > > now
> > > > > > > > > > > > > > > allocate 800% for each user. Or increase the
> > quota
> > > > for
> > > > > a
> > > > > > > few
> > > > > > > > > > users.
> > > > > > > > > > > > To
> > > > > > > > > > > > > > me,
> > > > > > > > > > > > > > > it feels like admin needs to convert 50% to
> 800%
> > > and
> > > > > > Kafka
> > > > > > > > > > > internally
> > > > > > > > > > > > > > needs
> > > > > > > > > > > > > > > to convert 800% to (500%, 300%). Everyone using
> > > just
> > > > > 50%
> > > > > > > > feels
> > > > > > > > > a
> > > > > > > > > > > lot
> > > > > > > > > > > > > > > simpler.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. We decide to add some other thread to this
> > list.
> > > > > Admin
> > > > > > > > needs
> > > > > > > > > > to
> > > > > > > > > > > > know
> > > > > > > > > > > > > > > exactly how many threads form the maximum
> quota.
> > > And
> > > > we
> > > > > > can
> > > > > > > > be
> > > > > > > > > > > > changing
> > > > > > > > > > > > > > > this between broker versions as we add more to
> > the
> > > > > list.
> > > > > > > > Again
> > > > > > > > > a
> > > > > > > > > > > > single
> > > > > > > > > > > > > > > overall percent would be a lot simpler.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > There were others who were unconvinced by a
> > single
> > > > > > percent
> > > > > > > > from
> > > > > > > > > > the
> > > > > > > > > > > > > > initial
> > > > > > > > > > > > > > > proposal and were happier with thread units
> > similar
> > > > to
> > > > > > CPU
> > > > > > > > > units,
> > > > > > > > > > > so
> > > > > > > > > > > > I
> > > > > > > > > > > > > am
> > > > > > > > > > > > > > > ok with going with per-thread quotas (as units
> or
> > > > > > percent).
> > > > > > > > > Just
> > > > > > > > > > > not
> > > > > > > > > > > > > sure
> > > > > > > > > > > > > > > it makes it easier for admin in all cases.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <
> > > > > > jun@confluent.io>
> > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Consider modeling as n * 100% unit. For 2),
> the
> > > > > > question
> > > > > > > is
> > > > > > > > > > > what's
> > > > > > > > > > > > > > > causing
> > > > > > > > > > > > > > > > the I/O threads to be saturated. It's
> unlikely
> > > that
> > > > > all
> > > > > > > > > users'
> > > > > > > > > > > > > > > utilization
> > > > > > > > > > > > > > > > has increased at the same time. A more likely
> case
> > is
> > > > > that
> > > > > > a
> > > > > > > > few
> > > > > > > > > > > > isolated
> > > > > > > > > > > > > > > > users' utilization has increased. If so,
> after
> > > > > > > increasing
> > > > > > > > > the
> > > > > > > > > > > > number
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > threads, the admin just needs to adjust the
> > quota
> > > > > for a
> > > > > > > few
> > > > > > > > > > > > isolated
> > > > > > > > > > > > > > > users,
> > > > > > > > > > > > > > > > which is expected and is less work.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1),
> all
> > > > > users'
> > > > > > > > quota
> > > > > > > > > > need
> > > > > > > > > > > > to
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > adjusted, which is unexpected and is more
> work.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > So, to me, the n * 100% model seems more
> > > > convenient.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > As for future extension to cover network
> thread
> > > > > > > > utilization,
> > > > > > > > > I
> > > > > > > > > > > was
> > > > > > > > > > > > > > > thinking
> > > > > > > > > > > > > > > > that one way is to simply model the capacity
> as
> > > (n
> > > > +
> > > > > > m) *
> > > > > > > > > 100%
> > > > > > > > > > > > unit,
> > > > > > > > > > > > > > > where
> > > > > > > > > > > > > > > > n and m are the number of network and i/o
> > > threads,
> > > > > > > > > > respectively.
> > > > > > > > > > > > > Then,
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > each user, we can just add up the utilization
> > in
> > > > the
> > > > > > > > network
> > > > > > > > > > and
> > > > > > > > > > > > the
> > > > > > > > > > > > > > i/o
> > > > > > > > > > > > > > > > thread. If we do this, we don't need a new
> type
> > > of
> > > > > > quota.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini
> > Sivaram <
> > > > > > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If we use request.percentage as the
> > percentage
> > > > used
> > > > > > in
> > > > > > > a
> > > > > > > > > > single
> > > > > > > > > > > > I/O
> > > > > > > > > > > > > > > > thread,
> > > > > > > > > > > > > > > > > the total percentage being allocated will
> be
> > > > > > > > > num.io.threads *
> > > > > > > > > > > 100
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > I/O
> > > > > > > > > > > > > > > > > threads and num.network.threads * 100 for
> > > network
> > > > > > > > threads.
> > > > > > > > > A
> > > > > > > > > > > > single
> > > > > > > > > > > > > > > quota
> > > > > > > > > > > > > > > > > covering the two as a percentage wouldn't
> > quite
> > > > > work
> > > > > > if
> > > > > > > > you
> > > > > > > > > > > want
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > allocate the same proportion in both cases.
> > If
> > > we
> > > > > > want
> > > > > > > to
> > > > > > > > > > treat
> > > > > > > > > > > > > > threads
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > separate units, won't we need two quota
> > > > > > configurations
> > > > > > > > > > > regardless
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > whether we use units or percentage?
> Perhaps I
> > > > > > > > misunderstood
> > > > > > > > > > > your
> > > > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >    1. The use case that you mentioned where
> > an
> > > > > admin
> > > > > > is
> > > > > > > > > > adding
> > > > > > > > > > > > more
> > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > >    and decides to add more I/O threads and
> > > > expects
> > > > > to
> > > > > > > > find
> > > > > > > > > > free
> > > > > > > > > > > > > quota
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > >    allocate for new users.
> > > > > > > > > > > > > > > > >    2. Admin adds more I/O threads because
> the
> > > I/O
> > > > > > > threads
> > > > > > > > > are
> > > > > > > > > > > > > > saturated
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > >    there are cores available to allocate,
> > even
> > > > > though
> > > > > > > the
> > > > > > > > > > > number
> > > > > > > > > > > > or
> > > > > > > > > > > > > > > > >    users/clients hasn't changed.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If we treated I/O threads as a
> > single
> > > > > unit
> > > > > > of
> > > > > > > > > 100%,
> > > > > > > > > > > all
> > > > > > > > > > > > > > user
> > > > > > > > > > > > > > > > > quotas need to be reallocated for 1). If we
> > > > > allocated
> > > > > > > I/O
> > > > > > > > > > > threads
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > n
> > > > > > > > > > > > > > > > > units with n*100%, all user quotas need to
> be
> > > > > > > reallocated
> > > > > > > > > for
> > > > > > > > > > > 2),
> > > > > > > > > > > > > > > > otherwise
> > > > > > > > > > > > > > > > > some of the new threads may just not be
> used.
> > > > > Either
> > > > > > > way
> > > > > > > > it
> > > > > > > > > > > > should
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > easy
> > > > > > > > > > > > > > > > > to write a script to decrease/increase
> quotas
> > > by
> > > > a
> > > > > > > > multiple
> > > > > > > > > > for
> > > > > > > > > > > > all
> > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > So it really boils down to which quota unit
> > is
> > > > most
> > > > > > > > > intuitive
> > > > > > > > > > > in
> > > > > > > > > > > > > > terms
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > configuration. And from the discussion so
> > far,
> > > it
> > > > > > feels
> > > > > > > > > like
> > > > > > > > > > > > > opinion
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > divided on whether quotas should be carved
> > out
> > > of
> > > > > an
> > > > > > > > > absolute
> > > > > > > > > > > > 100%
> > > > > > > > > > > > > > (or
> > > > > > > > > > > > > > > 1
> > > > > > > > > > > > > > > > > unit) or be relative to the number of
> threads
> > > > > (n*100%
> > > > > > > or
> > > > > > > > n
> > > > > > > > > > > > units).
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <
> > > > > > > > jun@confluent.io>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Another way to express an absolute limit
> is
> > > to
> > > > > use
> > > > > > > > > > > > > > > request.percentage,
> > > > > > > > > > > > > > > > > but
> > > > > > > > > > > > > > > > > > treat it as the percentage used in a
> single
> > > > > request
> > > > > > > > > > handling
> > > > > > > > > > > > > > thread.
> > > > > > > > > > > > > > > > For
> > > > > > > > > > > > > > > > > > now, the request handling threads can be
> > just
> > > > the
> > > > > > io
> > > > > > > > > > threads.
> > > > > > > > > > > > In
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > future, they can cover the network
> threads
> > as
> > > > > well.
> > > > > > > > This
> > > > > > > > > is
> > > > > > > > > > > > > similar
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > > top reports CPU usage and may be a bit
> > easier
> > > > for
> > > > > > > > people
> > > > > > > > > to
> > > > > > > > > > > > > > > understand.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun
> Rao <
> > > > > > > > > > jun@confluent.io>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Hi, Jay,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > 2. Regarding request.unit vs
> > > > > request.percentage.
> > > > > > I
> > > > > > > > > > started
> > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > > request.percentage too. The reasoning
> for
> > > > > > > > request.unit
> > > > > > > > > is
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > > following.
> > > > > > > > > > > > > > > > > > > Suppose that the capacity has been
> > reached
> > > > on a
> > > > > > > > broker
> > > > > > > > > > and
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > admin
> > > > > > > > > > > > > > > > > > needs
> > > > > > > > > > > > > > > > > > > to add a new user. A simple way to
> > increase
> > > > the
> > > > > > > > > capacity
> > > > > > > > > > is
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > increase
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > > number of io threads, assuming there
> are
> > > > still
> > > > > > > enough
> > > > > > > > > > > cores.
> > > > > > > > > > > > If
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > limit
> > > > > > > > > > > > > > > > > > > is based on percentage, the additional
> > > > capacity
> > > > > > > > > > > automatically
> > > > > > > > > > > > > > gets
> > > > > > > > > > > > > > > > > > > distributed to existing users and we
> > > haven't
> > > > > > really
> > > > > > > > > > carved
> > > > > > > > > > > > out
> > > > > > > > > > > > > > any
> > > > > > > > > > > > > > > > > > > additional resource for the new user.
> > Now,
> > > is
> > > > > it
> > > > > > > easy
> > > > > > > > > > for a
> > > > > > > > > > > > > user
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > reason
> > > > > > > > > > > > > > > > > > > about 0.1 unit vs 10%. My feeling is
> that
> > > > both
> > > > > > are
> > > > > > > > hard
> > > > > > > > > > and
> > > > > > > > > > > > > have
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > > configured empirically. Not sure if
> > > > percentage
> > > > > is
> > > > > > > > > > obviously
> > > > > > > > > > > > > > easier
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > > reason about.
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay
> > Kreps
> > > <
> > > > > > > > > > > jay@confluent.io
> > > > > > > > > > > > >
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> A couple of quick points:
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> 1. Even though the implementation of
> > this
> > > > > quota
> > > > > > is
> > > > > > > > > only
> > > > > > > > > > > > using
> > > > > > > > > > > > > io
> > > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > >> time, i think we should call it
> > something
> > > > like
> > > > > > > > > > > > "request-time".
> > > > > > > > > > > > > > > This
> > > > > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > > > >> give us flexibility to improve the
> > > > > > implementation
> > > > > > > to
> > > > > > > > > > cover
> > > > > > > > > > > > > > network
> > > > > > > > > > > > > > > > > > threads
> > > > > > > > > > > > > > > > > > >> in the future and will avoid exposing
> > > > internal
> > > > > > > > details
> > > > > > > > > > > like
> > > > > > > > > > > > > our
> > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > >> pools on the server.
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> 2. Jun/Roger, I get what you are
> trying
> > to
> > > > fix
> > > > > > but
> > > > > > > > the
> > > > > > > > > > > idea
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > > >> thread/units
> > > > > > > > > > > > > > > > > > >> is super unintuitive as a user-facing
> > > knob.
> > > > I
> > > > > > had
> > > > > > > to
> > > > > > > > > > read
> > > > > > > > > > > > the
> > > > > > > > > > > > > > KIP
> > > > > > > > > > > > > > > > like
> > > > > > > > > > > > > > > > > > >> eight times to understand this. I'm
> not
> > > sure
> > > > > > that
> > > > > > > > your
> > > > > > > > > > > point
> > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > >> increasing the number of threads is a
> > > > problem
> > > > > > > with a
> > > > > > > > > > > > > > > > percentage-based
> > > > > > > > > > > > > > > > > > >> value, it really depends on whether
> the
> > > user
> > > > > > > thinks
> > > > > > > > > > about
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > "percentage
> > > > > > > > > > > > > > > > > > >> of request processing time" or "thread
> > > > units".
> > > > > > If
> > > > > > > > they
> > > > > > > > > > > think
> > > > > > > > > > > > > "I
> > > > > > > > > > > > > > > have
> > > > > > > > > > > > > > > > > > >> allocated 10% of my request processing
> > > time
> > > > to
> > > > > > > user
> > > > > > > > x"
> > > > > > > > > > > then
> > > > > > > > > > > > it
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > bug
> > > > > > > > > > > > > > > > > > >> that increasing the thread count
> > decreases
> > > > > that
> > > > > > > > > percent
> > > > > > > > > > as
> > > > > > > > > > > > it
> > > > > > > > > > > > > > does
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> current proposal. As a practical
> matter
> > I
> > > > > think
> > > > > > > the
> > > > > > > > > only
> > > > > > > > > > > way
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > actually
> > > > > > > > > > > > > > > > > > >> reason about this is as a percent---I
> > just
> > > > > don't
> > > > > > > > > believe
> > > > > > > > > > > > > people
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > > going
> > > > > > > > > > > > > > > > > > >> to think, "ah, 4.3 thread units, that
> is
> > > the
> > > > > > right
> > > > > > > > > > > amount!".
> > > > > > > > > > > > > > > > Instead I
> > > > > > > > > > > > > > > > > > >> think they have to understand this
> > thread
> > > > unit
> > > > > > > > > concept,
> > > > > > > > > > > > figure
> > > > > > > > > > > > > > out
> > > > > > > > > > > > > > > > > what
> > > > > > > > > > > > > > > > > > >> they have set in number of threads,
> > > compute
> > > > a
> > > > > > > > percent
> > > > > > > > > > and
> > > > > > > > > > > > then
> > > > > > > > > > > > > > > come
> > > > > > > > > > > > > > > > up
> > > > > > > > > > > > > > > > > > >> with
> > > > > > > > > > > > > > > > > > >> the number of thread units, and these
> > will
> > > > all
> > > > > > be
> > > > > > > > > wrong
> > > > > > > > > > if
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > >> count changes. I also think this ties
> us
> > > to
> > > > > > > > throttling
> > > > > > > > > > the
> > > > > > > > > > > > I/O
> > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > >> pool,
> > > > > > > > > > > > > > > > > > >> which may not be where we want to end
> > up.
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> 3. For what it's worth I do think
> > having a
> > > > > > single
> > > > > > > > > > > > throttle_ms
> > > > > > > > > > > > > > > field
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > >> the responses that combines all
> > throttling
> > > > > from
> > > > > > > all
> > > > > > > > > > quotas
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> simplest. There could be a use case
> for
> > > > having
> > > > > > > > > separate
> > > > > > > > > > > > fields
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > each,
> > > > > > > > > > > > > > > > > > >> but I think that is actually harder to
> > > > > > use/monitor
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > > common
> > > > > > > > > > > > > > > > case
> > > > > > > > > > > > > > > > > so
> > > > > > > > > > > > > > > > > > >> unless someone has a use case I think
> > just
> > > > one
> > > > > > > > should
> > > > > > > > > be
> > > > > > > > > > > > fine.
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> -Jay
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM,
> Rajini
> > > > > Sivaram
> > > > > > <
> > > > > > > > > > > > > > > > > > rajinisivaram@gmail.com>
> > > > > > > > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > >> > I have updated the KIP based on the
> > > > > > discussions
> > > > > > > so
> > > > > > > > > > far.
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > Regards,
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > Rajini
> > > > > > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM,
> > Rajini
> > > > > > > Sivaram <
> > > > > > > > > > > > > > > > > > >> rajinisivaram@gmail.com>
> > > > > > > > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > > > > > > > >> >
Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like LeaderAndIsr etc. The simplest way to ensure that clients cannot use these requests to bypass quotas for DoS attacks is to ensure that ACLs prevent clients from using these requests and that unauthorized requests are included towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a separate throttle time, and all utilization based quotas could use the same field (we won't add another one for network thread utilization, for instance). But perhaps it makes sense to keep byte rate quotas separate in produce/fetch responses to provide separate metrics? Agree with Ismael that the name of the existing field should be changed if we have two. Happy to switch to a single combined throttle time if that is sufficient.

Ismael #4, #5, #6: Will update KIP. Will use a dot separated name for the new property. Replication quotas use dot separated, so it will be consistent with all properties except byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen because the time per request can vary significantly between requests, as mentioned in the discussion and the KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more configuration and more metrics. Since most users would set quotas higher than the expected usage and quotas are more of a safety net, a single quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of active connections, since only one request per connection will be throttled at a time.
#4 As with byte rate quotas, to use the full allocated quotas, clients/users would need to use partitions that are distributed across the cluster. The alternative of using cluster-wide quotas instead of per-broker quotas would be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for the quota types Fetch and Produce. A new one will be added for IOThread, which manages quotas for I/O thread utilization. This will not update the Fetch or Produce queue-size, but will have a separate metric for the queue-size. I wasn't planning to add any additional metrics apart from the equivalent ones for existing quotas as part of this KIP. Ratio of byte-rate to I/O thread utilization could be slightly misleading since it depends on the sequence of requests. But we can look into more metrics after the KIP is implemented, if required.

I think we need to limit the maximum delay since all requests are throttled. If a client has a quota of 0.001 units and a single request used 50ms, we don't want to delay all requests from the client by 50 seconds, throwing the client out of all its consumer groups. The issue arises only if a user is allocated a quota that is insufficient to process even one large request. The expectation is that the units allocated per user will be much higher than the time taken to process one request, so the limit should seldom be applied. Agree this needs proper documentation.
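To make the arithmetic concrete, here is a minimal sketch of the calculation I have in mind (illustrative names only, not from the patch):

    // Illustrative only: delay = (time used on I/O threads) / quota,
    // capped at the quota window so that one large request cannot
    // stall a client indefinitely.
    val quotaUnits    = 0.001   // fraction of one I/O thread allocated to the client
    val requestTimeMs = 50.0    // time this request spent on an I/O thread
    val windowMs      = 30000.0 // e.g. a 30-second quota window

    val uncappedDelayMs = requestTimeMs / quotaUnits          // 50000 ms, i.e. 50 seconds
    val delayMs         = math.min(uncappedDelayMs, windowMs) // capped at 30 seconds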
Regards,

Rajini
On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:

@jun: i wasnt concerned about tying up a request processing thread, but IIUC the code does still read the entire request out, which might add up to a non-negligible amount of memory.
On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to the window size if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota over a long period of time. Can you provide an upper bound on this deviation?

2) What is the motivation for capping the maximum delay at the window size? I am wondering if there is a better alternative to address the problem.

3) It means that the existing metric-related config will have a more direct impact on the mechanism of this io-thread-unit-based quota. This may be an important change depending on the answer to 1) above. We probably need to document this more explicitly.

Dong
On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't because at LinkedIn it would be too much pressure on inGraph to expose those per-clientId metrics, so we ended up printing them periodically to a local log. Never mind if it is not a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for every quota to ProduceResponse or FetchResponse. Is there any use-case for having separate throttle-time fields for byte-rate-quota and io-thread-unit-quota? You probably need to document this as an interface change if you plan to add a new field in any request.

- I don't think IOThread belongs to quotaType. The existing quota types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the type of request that is throttled, not the quota mechanism that is applied.

- If a request is throttled due to this io-thread-unit-based quota, is the existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the io-thread-unit-based quota, and for users to understand its impact on their traffic, would it be useful to have a metric that shows the overall byte-rate per io-thread-unit? Can we also show this as a per-clientId metric?

Thanks,
Dong
On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:
Hi, Ismael,

For #3, typically an admin won't configure more io threads than CPU cores, but it's possible for an admin to start with fewer io threads than cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of returning an error immediately is that the latter has no way to prevent the client from retrying immediately, which will make things worse. The delaying logic is based off a delay queue. A separate expiration thread just waits on the next request to be expired. So, it doesn't tie up a request handler thread.
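As a rough sketch of the mechanism (not the actual broker code; class and variable names are made up):

    // Sketch only: throttled responses are parked in a DelayQueue and a
    // single expiration thread sends each one once its delay has elapsed.
    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    class ThrottledResponse(val send: () => Unit, delayMs: Long) extends Delayed {
      private val dueAtMs = System.currentTimeMillis() + delayMs
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(dueAtMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS)
      override def compareTo(other: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.MILLISECONDS),
                               other.getDelay(TimeUnit.MILLISECONDS))
    }

    val throttled = new DelayQueue[ThrottledResponse]()
    val expirationThread = new Thread(() => while (true) throttled.take().send())
    expirationThread.setDaemon(true)
    expirationThread.start()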
Thanks,

Jun
On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle time field in the response. The downside is that the client metrics will be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and `log.cleaner.min.cleanable.ratio`.

Ismael
On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should have the total time your request was throttled, irrespective of the quotas that caused it? Limiting it to the byte rate quota doesn't make sense, but also I don't think we want to end up adding new fields in the response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about io threads. Once we introduce these quotas people set them and expect them to be enforced (and if they aren't it may cause an outage). As a result they are a bit more sensitive than normal configs, I think. The current thread pools seem like something of an implementation detail and not the level the user-facing quotas should be involved with. I think it might be better to make this a general request-time throttle with no mention in the naming about I/O threads and simply acknowledge the current limitation (which we may someday fix) in the docs that this covers only the time after the request is read off the network.
   3. As such I think the right interface to the user would be something like percent_request_time in {0,...,100} or request_time_ratio in {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale is between 0 and 1 in the other metrics, right?)
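Purely to make the two scales concrete (hypothetical spellings and values, nothing final):

    percent_request_time=25    # in {0,...,100}
    request_time_ratio=0.25    # in {0.0,...,1.0}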
-Jay
On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are going to be very similar to the existing metrics and sensors. To avoid confusion, I have now added more detail. All metrics are in the group "quotaType" and all sensors have names starting with "quotaType" (where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*). So there will be no reuse of existing metrics/sensors. The new ones for request processing time based throttling will be completely independent of existing metrics/sensors, but will be consistent in format.
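To illustrate the shape (a sketch against the common metrics API; the exact sensor names and tags are still subject to change):

    // Sketch only: how an IOThread throttle-time sensor might be registered,
    // mirroring the existing Produce/Fetch quota sensors. Names are not final.
    import org.apache.kafka.common.metrics.Metrics
    import org.apache.kafka.common.metrics.stats.Avg

    val metrics   = new Metrics()
    val quotaType = "IOThread"
    val sensor    = metrics.sensor(s"${quotaType}ThrottleTime-userA:clientB")
    sensor.add(metrics.metricName("throttle-time", quotaType,
      "Average throttle time in ms for this user/client-id"), new Avg())
    sensor.record(42.0) // the computed delay in ms, recorded when a request is throttled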
The existing throttle_time_ms field in produce/fetch responses will not be impacted by this KIP. That will continue to return byte-rate based throttling times. In addition, a new field request_throttle_time_ms will be added to return request quota based throttling times. These will be exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I believe there are already sufficient metrics to monitor throttling on both the client and broker side for each type of throttling.

Regards,

Rajini
On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:
Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to quota a user's traffic here. LGTM overall. I have some questions regarding sensors.

- Can you be more specific in the KIP about what sensors will be added? For example, it will be useful to specify the name and attributes of these new sensors.

- We currently have throttle-time and queue-size for the byte-rate based quota. Are you going to have separate throttle-time and queue-size for requests throttled by the io_thread_unit-based quota, or will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain time due to the io_thread_unit-based quota?

- Currently the kafka server doesn't provide any log or metrics that tell whether any given clientId (or user) is throttled. This is not too bad because we can still check the client-side byte-rate metric to validate whether a given client is throttled. But with this io_thread_unit, there will be no way to validate whether a given client is slow because it has exceeded its io_thread_unit limit. It is necessary for users to be able to know this information to figure out whether they have reached their quota limit. How about we add a log4j log on the server side to periodically print the (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that a kafka administrator can identify those users that have reached their limit and act accordingly?
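A purely illustrative sketch of what I mean (nothing like this exists in the broker today; the per-client map is hypothetical):

    // Periodically log per-client throttle times so an admin can spot
    // clients that are hitting either quota. Illustrative only.
    import java.util.concurrent.{Executors, TimeUnit}

    object ThrottleLogger {
      // clientId -> (byteRateThrottleMs, ioThreadThrottleMs), hypothetical
      @volatile var throttleTimes: Map[String, (Double, Double)] = Map.empty

      def start(): Unit = {
        val scheduler = Executors.newSingleThreadScheduledExecutor()
        scheduler.scheduleAtFixedRate(() => {
          throttleTimes.foreach { case (clientId, (byteRateMs, ioThreadMs)) =>
            println(s"throttled client=$clientId byteRateThrottleMs=$byteRateMs ioThreadThrottleMs=$ioThreadMs")
          }
        }, 1, 1, TimeUnit.MINUTES)
      }
    }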
Thanks,
Dong
On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:

Stated as "Request processing time throttling will be applied on top if necessary.", I thought that it meant the request processing time throttling is applied first, but continuing to read I found it actually meant to apply produce / fetch byte rate throttling first.

Also the last sentence, "The remaining delay if any is applied to the response.", is a bit confusing to me. Maybe reword it a bit?

Guozhang
On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun
On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The property is called *io_thread_units* to align with the thread count property *num.io.threads*. When we implement network thread utilization quotas, we can add another property *network_thread_units*.
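For example (illustrative values; shown as plain properties for readability, though the quota itself is allocated per user/client-id rather than set as a static broker config):

    num.io.threads=8
    io_thread_units=1.0    # this user/client may use up to one full I/O thread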
2. ControlledShutdown is already listed under the exempt requests. Jun, did you mean a different request that needs to be added? The four requests currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is easy to exclude them and only throttle if unauthorized. I wasn't sure if there are other requests used only for inter-broker communication that needed to be excluded.

3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling, if any, plus sends the response; roughly as in the sketch below. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Regards,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Rajini
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
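A minimal, self-contained sketch of that wrapper idea; all names here
(sendResponseMaybeThrottle, recordAndGetThrottleTimeMs, Response) are
hypothetical stand-ins rather than Kafka's actual API:

object ThrottleSketch {
  final case class Response(correlationId: Int)

  // Placeholder: a real implementation would update a windowed time
  // sample for this user and return how long to delay the response,
  // or 0 if the user is under quota.
  def recordAndGetThrottleTimeMs(user: String, handlerTimeNanos: Long): Long = 0L

  def sendResponse(r: Response): Unit = println(s"sent ${r.correlationId}")

  def delayResponse(r: Response, delayMs: Long): Unit =
    println(s"delayed ${r.correlationId} by ${delayMs}ms")

  // Every response goes through this one method, so the handler time of
  // the request is always recorded, and throttling happens after the
  // request has been fully processed.
  def sendResponseMaybeThrottle(user: String, handlerTimeNanos: Long,
                                r: Response): Unit = {
    val delayMs = recordAndGetThrottleTimeMs(user, handlerTimeNanos)
    if (delayMs > 0) delayResponse(r, delayMs) else sendResponse(r)
  }
}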
On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com>
wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1
request handler unit, then it's as if I have a Kafka broker with a single
request handler thread dedicated to me. That's the most I can use, at
least. That allocation doesn't change even if an admin later increases
the size of the request thread pool on the broker. It's similar to the
CPU abstraction that VMs and containers get from hypervisors or OS
schedulers. While different client access patterns can use wildly
different amounts of request thread resources per request, a given
application will generally have a stable access pattern and can figure
out empirically how many "request thread units" it needs to meet its
throughput/latency goals.

Cheers,

Roger
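A rough sketch of that "unit" abstraction (hypothetical types, not from
the KIP): one unit corresponds to one thread's worth of time per
wall-clock second, so the budget is independent of the pool size.

final case class UnitQuota(units: Double) {
  // Nanoseconds of request handler time this principal may use per second.
  def budgetNanosPerSecond: Long = (units * 1e9).toLong
}

object UnitQuotaExample extends App {
  val quota = UnitQuota(1.0)
  println(quota.budgetNanosPerSecond) // 1000000000: one dedicated thread's worth
  // Doubling the broker's request thread pool does not change this budget.
}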
On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern of request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota
based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to
be excluded from throttling.

3. Implementation-wise, I am wondering if it's simpler to apply the
request time throttling first in KafkaApis.handle(). Otherwise, we will
need to add the throttling logic in each type of request.

Thanks,

Jun
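To make point 1 concrete, a toy calculation with assumed pool sizes:

object QuotaUnits {
  // A percentage quota is relative to the pool, so its absolute value
  // moves with the pool size; an absolute unit quota would stay fixed.
  def absoluteCapacity(percent: Double, poolSize: Int): Double =
    percent / 100.0 * poolSize // capacity in "request handler threads"

  def main(args: Array[String]): Unit = {
    println(absoluteCapacity(10.0, 8))  // 0.8 threads before the resize
    println(absoluteCapacity(10.0, 16)) // 1.6 threads after doubling the pool
  }
}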
On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request
handler utilization. At the moment, it uses percentage, but I am happy to
change to a fraction (out of 1 instead of 100) if required. I have added
the examples from this discussion to the KIP. Also added a "Future Work"
section to address network thread utilization. The configuration is named
"request_time_percent" with the expectation that it can also be used as
the limit for network thread utilization when that is implemented, so
that users have to set only one config for the two and not have to worry
about the internal distribution of the work between the two thread pools
in Kafka.

Regards,

Rajini
On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is
exactly what people have said. I will just expand that a bit. Consider
the following case. The producer sends a produce request with a 10MB
message but compressed to 100KB with gzip. The decompression of the
message on the broker could take 10-15 seconds, during which time a
request handler thread is completely blocked. In this case, neither the
byte-in quota nor the request rate quota may be effective in protecting
the broker. Consider another case. A consumer group starts with 10
instances and later on switches to 20 instances. The request rate will
likely double, but the actual load on the broker may not double since
each fetch request only contains half of the partitions. A request rate
quota may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much
of the server side resources. In this particular KIP, this resource is
the capacity of the request handler threads. I agree that it may not be
intuitive for the users to determine how to set the right limit. However,
this is not completely new and has been done in the container world
already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
has the concept of cpu.cfs_quota_us, which specifies the total amount of
time in microseconds for which all tasks in a cgroup can run during a
one-second period. We can potentially model the request handler threads
in a similar way. For example, each request handler thread can be 1
request handler unit and the admin can configure a limit on how many
units (say 0.01) a client can have.

Regarding not throttling the internal broker-to-broker requests: we could
do that. Alternatively, we could just let the admin configure a high
limit for the kafka user (it may not be able to do that easily based on
clientId though).

Ideally we want to be able to protect the utilization of the network
thread pool too. The difficulty is mostly what Rajini said: (1) The
mechanism for throttling the requests is through Purgatory and we will
have to think through how to integrate that into the network layer. (2)
In the network layer, currently we know the user, but not the clientId of
the request. So, it's a bit tricky to throttle based on clientId there.
Plus, the byteOut quota can already protect the network thread
utilization for fetch requests. So, if we can't figure out this part
right now, just focusing on the request handling threads for this KIP is
still a useful feature.

Thanks,

Jun
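A self-contained sketch of what cgroup-style accounting could look like
for handler threads (hypothetical class, assumed one-second window, cf.
cpu.cfs_quota_us):

import scala.collection.mutable

final class ThreadTimeQuota(unitsAllowed: Double) {
  private val windowNanos = 1000000000L // one-second window, as in cgroups
  private val samples = mutable.Queue.empty[(Long, Long)] // (timestampNanos, busyNanos)

  // Record handler-thread time consumed by this client and expire samples
  // that have fallen out of the window.
  def record(nowNanos: Long, busyNanos: Long): Unit = {
    samples.enqueue((nowNanos, busyNanos))
    while (samples.nonEmpty && samples.head._1 < nowNanos - windowNanos)
      samples.dequeue()
  }

  // True if the client used more thread time in the window than its units
  // allow; e.g. unitsAllowed = 0.01 permits 10ms of thread time per second.
  def overQuota: Boolean =
    samples.iterator.map(_._2).sum > (unitsAllowed * windowNanos).toLong
}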
On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc.; these are
throttled only if authorization fails (so they can't be used for DoS
attacks in a secure cluster, but inter-broker requests complete without
delays).

I will wait another day to see if there is any objection to quotas based
on request processing time (as opposed to request rate) and if there are
no objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request
handler threads (that made calculation easy). I think the suggestion is
to include the time spent in the network threads as well since that may
be significant. As Jay pointed out, it is more complicated to calculate
the total available CPU time and convert to a ratio when there are *m*
I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may
give us what we want, but it can be very expensive on some platforms. As
Becket and Guozhang have pointed out, we do have several time
measurements already for generating metrics that we could use, though we
might want to switch to nanoTime() instead of currentTimeMillis() since
some of the values for small requests may be < 1ms. But rather than add
up the time spent in I/O threads and network threads, wouldn't it be
better to convert the time spent on each thread into a separate ratio?
UserA has a request quota of 5%. Can we take that to mean that UserA can
use 5% of the time on network threads and 5% of the time on I/O threads?
If either is exceeded, the response is throttled - it would mean
maintaining two sets of metrics for the two durations, but would result
in more meaningful ratios. We could define two quota limits (UserA has 5%
of request threads and 10% of network threads), but that seems
unnecessary and harder to explain to users.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > Back
> > to
> > > > why
> > > > > > and
> > > > > > > > how
> > > > > > > > > > > quotas
> > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > applied
> > > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > > >> > >> > network
> > > > > > > > > > > > > > > > > > >> > >> > >> > > thread
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > utilization:
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > a)
> In
> > > the
> > > > > case
> > > > > > > of
> > > > > > > > > > fetch,
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > > >> spent in
> > > > > > > > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > > > > > > > >> > >> > >> > network
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > thread
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > may
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > be
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > significant
> > > > > > and
> > > > > > > I
> > > > > > > > > can
> > > > > > > > > > > see
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > >> > include
> > > > > > > > > > > > > > > > > > >> > >> > >> this.
> > > > > > > > > > > > > > > > > > >> > >> > >> > Are
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > there
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > requests
> > > > > where
> > > > > > > the
> > > > > > > > > > > network
> > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > >> > >> utilization is
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > significant?
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > In
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > case
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > of
> > > fetch,
> > > > > > > request
> > > > > > > > > > > handler
> > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > >> > utilization
> > > > > > > > > > > > > > > > > > >> > >> > would
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > throttle
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > with
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > high
> > > > request
> > > > > > > rate,
> > > > > > > > > low
> > > > > > > > > > > > data
> > > > > > > > > > > > > > > volume
> > > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > >> > fetch
> > > > > > > > > > > > > > > > > > >> > >> > byte
> > > > > > > > > > > > > > > > > > >> > >> > >> > rate
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > quota
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > will
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> throttle
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > clients
> > > > with
> > > > > > > high
> > > > > > > > > data
> > > > > > > > > > > > > volume.
> > > > > > > > > > > > > > > > > Network
> > > > > > > > > > > > > > > > > > >> > thread
> > > > > > > > > > > > > > > > > > >> > >> > >> > > utilization
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > perhaps
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > proportional
> > > > > > to
> > > > > > > > the
> > > > > > > > > > data
> > > > > > > > > > > > > > > volume. I
> > > > > > > > > > > > > > > > > am
> > > > > > > > > > > > > > > > > > >> > >> wondering
> > > > > > > > > > > > > > > > > > >> > >> > >> if we
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > even
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > need
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > to
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> throttle
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> based
> > on
> > > > > > network
> > > > > > > > > > thread
> > > > > > > > > > > > > > > > utilization
> > > > > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > > >> > >> whether
> > > > > > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > data
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > volume
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > quota
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > covers
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > this
> > > case.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > b)
> At
> > > the
> > > > > > > moment,
> > > > > > > > we
> > > > > > > > > > > > record
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > check
> > > > > > > > > > > > > > > > > > >> for
> > > > > > > > > > > > > > > > > > >> > >> quota
> > > > > > > > > > > > > > > > > > >> > >> > >> > > violation
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > same
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > time.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > If a
> > > quota
> > > > > is
> > > > > > > > > > violated,
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > response
> > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > >> > >> delayed.
> > > > > > > > > > > > > > > > > > >> > >> > >> > Using
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > Jay'e
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > example
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > of
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > disk
> > > reads
> > > > > for
> > > > > > > > > fetches
> > > > > > > > > > > > > > happening
> > > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> network
> > > > > > > > > > > > > > > > > > >> > >> > >> > thread,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > We
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > can't
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > record
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> delay
> > a
> > > > > > response
> > > > > > > > > after
> > > > > > > > > > > the
> > > > > > > > > > > > > > disk
> > > > > > > > > > > > > > > > > reads.
> > > > > > > > > > > > > > > > > > >> We
> > > > > > > > > > > > > > > > > > >> > >> could
> > > > > > > > > > > > > > > > > > >> > >> > >> > record
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > spent
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > the
> > > > network
> > > > > > > thread
> > > > > > > > > > when
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > response
> > > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > >> > >> complete
> > > > > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > introduce
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > a
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > delay
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > handling a
> > > > > > > > > subsequent
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > > (separate
> > > > > > > > > > > > > > > > > > >> out
> > > > > > > > > > > > > > > > > > >> > >> > >> recording
> > > > > > > > > > > > > > > > > > >> > >> > >> > > and
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > quota
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> violation
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > handling
> > > > in
> > > > > > the
> > > > > > > > case
> > > > > > > > > > of
> > > > > > > > > > > > > > network
> > > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > > >> > >> > overload).
> > > > > > > > > > > > > > > > > > >> > >> > >> > Does
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > make
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > sense?
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > Regards,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> Rajini
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > On
> > Tue,
> > > > Feb
> > > > > > 21,
> > > > > > > > 2017
> > > > > > > > > > at
> > > > > > > > > > > > 2:58
> > > > > > > > > > > > > > AM,
> > > > > > > > > > > > > > > > > > Becket
> > > > > > > > > > > > > > > > > > >> > Qin <
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > becket.qin@gmail.com>
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> Hey
> > > Jay,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > Yeah,
> > > I
> > > > > > agree
> > > > > > > > that
> > > > > > > > > > > > > enforcing
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > > >> time
> > > > > > > > > > > > > > > > > > >> > >> is a
> > > > > > > > > > > > > > > > > > >> > >> > >> > little
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > tricky. I
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > am
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > thinking
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> that
> > > > maybe
> > > > > > we
> > > > > > > > can
> > > > > > > > > > use
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > existing
> > > > > > > > > > > > > > > > > > >> > request
> > > > > > > > > > > > > > > > > > >> > >> > >> > > statistics.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > They
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > are
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > already
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> very
> > > > > > detailed
> > > > > > > so
> > > > > > > > > we
> > > > > > > > > > > can
> > > > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > > see
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> > >> > >> > approximate
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > CPU
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > from
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > it,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > something
> > > > > > like
> > > > > > > > > > > > > (total_time -
> > > > > > > > > > > > > > > > > > >> > >> > >> > > >
> > > request/response_queue_time
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > -
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > remote_time).
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > I
> > > agree
> > > > > with
> > > > > > > > > > Guozhang
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > > >> user is
> > > > > > > > > > > > > > > > > > >> > >> > >> throttled
> > > > > > > > > > > > > > > > > > >> > >> > >> > > it
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > is
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > likely
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > that
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> need
> > > to
> > > > > see
> > > > > > if
> > > > > > > > > > > anything
> > > > > > > > > > > > > has
> > > > > > > > > > > > > > > went
> > > > > > > > > > > > > > > > > > wrong
> > > > > > > > > > > > > > > > > > >> > >> first,
> > > > > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > > > > >> > >> > >> > if
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > users
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > are
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > well
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > behaving
> > > > > and
> > > > > > > > just
> > > > > > > > > > need
> > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > > >> resources, we
> > > > > > > > > > > > > > > > > > >> > >> will
> > > > > > > > > > > > > > > > > > >> > >> > >> have
> > > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > bump
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > up
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > quota
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> for
> > > > them.
> > > > > It
> > > > > > > is
> > > > > > > > > true
> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > >> pre-allocating
> > > > > > > > > > > > > > > > > > >> > >> CPU
> > > > > > > > > > > > > > > > > > >> > >> > >> time
> > > > > > > > > > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > precisely
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > for
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > users
> > > is
> > > > > > > > > difficult.
> > > > > > > > > > So
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > > practice
> > > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > >> > would
> > > > > > > > > > > > > > > > > > >> > >> > >> > probably
> > > > > > > > > > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > more
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > like
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > first
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > set
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > a
> > > > relative
> > > > > > > high
> > > > > > > > > > > > protective
> > > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > > >> quota
> > > > > > > > > > > > > > > > > > >> > >> for
> > > > > > > > > > > > > > > > > > >> > >> > >> > > everyone
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > and
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > increase
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> for
> > > some
> > > > > > > > > individual
> > > > > > > > > > > > > clients
> > > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > > demand.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > Thanks,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > Jiangjie
> > > > > > > > (Becket)
> > > > > > > > > > Qin
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > On
> > > Mon,
> > > > > Feb
> > > > > > > 20,
> > > > > > > > > 2017
> > > > > > > > > > > at
> > > > > > > > > > > > > 5:48
> > > > > > > > > > > > > > > PM,
> > > > > > > > > > > > > > > > > > >> Guozhang
> > > > > > > > > > > > > > > > > > >> > >> > Wang <
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > wangguoz@gmail.com
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> wrote:
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > This
> > > > is
> > > > > a
> > > > > > > > great
> > > > > > > > > > > > > proposal,
> > > > > > > > > > > > > > > glad
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > >> see
> > > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > > >> > >> > >> > happening.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> I
> > am
> > > > > > > inclined
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > > >> throttling, or
> > > > > > > > > > > > > > > > > > >> > >> more
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > specifically
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> processing
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > ratio
> > > > > > > instead
> > > > > > > > of
> > > > > > > > > > the
> > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > > > > >> > >> throttling
> > > > > > > > > > > > > > > > > > >> > >> > >> as
> > > > > > > > > > > > > > > > > > >> > >> > >> > > well.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > Becket
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > has
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > very
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> well
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > summed
> > > > > my
> > > > > > > > > > rationales
> > > > > > > > > > > > > > above,
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > > > > >> > >> thing to
> > > > > > > > > > > > > > > > > > >> > >> > >> add
> > > > > > > > > > > > > > > > > > >> > >> > >> > > here
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > that
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> former
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > has
> > > a
> > > > > good
> > > > > > > > > support
> > > > > > > > > > > for
> > > > > > > > > > > > > > both
> > > > > > > > > > > > > > > > > > >> "protecting
> > > > > > > > > > > > > > > > > > >> > >> > >> against
> > > > > > > > > > > > > > > > > > >> > >> > >> > > rogue
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients"
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > as
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > well
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > "utilizing a
> > > > > > > > > > cluster
> > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > > >> multi-tenancy
> > > > > > > > > > > > > > > > > > >> > >> > usage":
> > > > > > > > > > > > > > > > > > >> > >> > >> > when
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > thinking
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > about
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > explain
> > > > > > this
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > end
> > > > > > > > > > > > > > > > users, I
> > > > > > > > > > > > > > > > > > >> find
> > > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > > >> > >> > >> actually
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > more
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > natural
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > than
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > request
> > > > > > rate
> > > > > > > > > since
> > > > > > > > > > > as
> > > > > > > > > > > > > > > > mentioned
> > > > > > > > > > > > > > > > > > >> above,
> > > > > > > > > > > > > > > > > > >> > >> > >> different
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > will
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > have
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> quite
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > different
> > > > > > > > > "cost",
> > > > > > > > > > > and
> > > > > > > > > > > > > > Kafka
> > > > > > > > > > > > > > > > > today
> > > > > > > > > > > > > > > > > > >> > already
> > > > > > > > > > > > > > > > > > >> > >> > have
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > various
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > request
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > types
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > (produce,
> > > > > > > > fetch,
> > > > > > > > > > > > admin,
> > > > > > > > > > > > > > > > > metadata,
> > > > > > > > > > > > > > > > > > >> etc),
> > > > > > > > > > > > > > > > > > >> > >> > >> because
> > > > > > > > > > > > > > > > > > >> > >> > >> > of
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > request
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > throttling
> > > > > > > may
> > > > > > > > > not
> > > > > > > > > > > be
> > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > effective
> > > > > > > > > > > > > > > > > > >> > >> unless it
> > > > > > > > > > > > > > > > > > >> > >> > >> is
> > > > > > > > > > > > > > > > > > >> > >> > >> > set
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > very
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > conservatively.
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > Regarding
> > > > > > to
> > > > > > > > > user
> > > > > > > > > > > > > > reactions
> > > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > > >> they
> > > > > > > > > > > > > > > > > > >> > are
> > > > > > > > > > > > > > > > > > >> > >> > >> > > throttled,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > I
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > think
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > it
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > may
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > differ
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > case-by-case,
> > > > > > > > > and
> > > > > > > > > > > need
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > > >> > discovered /
> > > > > > > > > > > > > > > > > > >> > >> > >> guided
> > > > > > > > > > > > > > > > > > >> > >> > >> > by
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > looking
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > at
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> relative
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > metrics.
> > > > > > So
> > > > > > > in
> > > > > > > > > > other
> > > > > > > > > > > > > words
> > > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > > > > >> > >> not
> > > > > > > > > > > > > > > > > > >> > >> > >> expect
> > > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > get
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > additional
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > information
> > > > > > > by
> > > > > > > > > > > simply
> > > > > > > > > > > > > > being
> > > > > > > > > > > > > > > > told
> > > > > > > > > > > > > > > > > > >> "hey,
> > > > > > > > > > > > > > > > > > >> > >> you
> > > > > > > > > > > > > > > > > > >> > >> > are
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > throttled",
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > which
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > what
> > > > > > > > throttling
> > > > > > > > > > > does;
> > > > > > > > > > > > > they
> > > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > >> > take a
> > > > > > > > > > > > > > > > > > >> > >> > >> > follow-up
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > step
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > and
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > see
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > "hmm,
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> I'm
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > throttled
> > > > > > > > > probably
> > > > > > > > > > > > > because
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > ..",
> > > > > > > > > > > > > > > > > > >> > which
> > > > > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > > > > >> > >> > by
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > looking
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > metric
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > values:
> > > > > > e.g.
> > > > > > > > > > whether
> > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > > bombarding
> > > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > > >> > >> > >> brokers
> > > > > > > > > > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > > [Message clipped]
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > *Todd Palino*
> > > > > > > > > Staff Site Reliability Engineer
> > > > > > > > > Data Infrastructure Streaming
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > linkedin.com/in/toddpalino
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > *Todd Palino*
> > > > > > > Staff Site Reliability Engineer
> > > > > > > Data Infrastructure Streaming
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > linkedin.com/in/toddpalino
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Thanks for the updated KIP. Looks good. Just one more thing.

50. "Two new metrics request-throttle-time-max and request-throttle-time-min
 will be added to reflect total request processing time based throttling
for all request types including produce/fetch." The most important clients
are producer and consumer, which already have the
produce/fetch-throttle-time-min/max
metrics. Should we just accumulate the throttled time for other requests
into these two existing metrics, instead of introducing new ones? We can
probably add a similar metric for the admin client later on.
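
A rough sketch of how that could look on the client side, for
illustration only (the sensor and metric names follow the existing
consumer metrics, but the recording site and wiring are assumptions,
not actual Kafka client code):

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Max;
    import org.apache.kafka.common.metrics.stats.Min;

    public class ThrottleMetricsSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            Sensor throttleTime = metrics.sensor("fetch-throttle-time");
            throttleTime.add(metrics.metricName("fetch-throttle-time-max",
                "consumer-fetch-manager-metrics"), new Max());
            throttleTime.add(metrics.metricName("fetch-throttle-time-min",
                "consumer-fetch-manager-metrics"), new Min());

            // Accumulate throttle_time_ms from *any* throttled response
            // type (not just fetch) into the same shared sensor:
            double throttleTimeMsFromResponse = 150.0; // e.g. from a response
            throttleTime.record(throttleTimeMsFromResponse);
        }
    }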

Jun


On Thu, Mar 9, 2017 at 2:24 PM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Jun,
>
> 40. Yes you are right, a single value tracking the total exempt time is
> sufficient. Have updated the KIP.
>
> Thank you,
>
> Rajini
>
> On Thu, Mar 9, 2017 at 9:42 PM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Rajini,
> >
> > The updated KIP looks good. Just one more comment.
> >
> > 40. "An additional metric exempt-request-time will also be added for each
> > quota entity for the quota type Request." Should that metric be added for
> > each entity type (e.g., user, client-id, etc)? It seems that value is
> > independent of entity types.
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Mar 9, 2017 at 12:07 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > wrote:
> >
> > > Hi Jun,
> > >
> > > Thank you for reviewing the KIP again.
> > >
> > > 30. That is a good idea. In fact, it is one of the advantages of
> > > measuring overall utilization rather than separate values for network
> > > and I/O threads as I had intended initially. Have updated the KIP,
> > > thanks.
> > >
> > > 31. Added exempt-request-time metric.
> > >
> > > 32. I had thought of using quota.window.size.seconds * quota.window.num
> > > initially, but felt that would be too big. Even the default of 11
> > > seconds is a rather long time to be throttled. With a limit of
> > > quota.window.size.seconds, subsequent requests for that total interval
> > > of the samples will also each be throttled for quota.window.size.seconds
> > > if the time recorded was very high. So limiting at
> > > quota.window.size.seconds limits the throttle time for an individual
> > > request, avoiding timeouts where possible, but still throttles over a
> > > period of time.
> > >
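> > > As a rough numeric illustration of that trade-off (the numbers and the
> > > shape of the delay computation below are assumptions for illustration,
> > > not the KIP's exact formula): with quota.window.size.seconds=1 and
> > > quota.window.num=11, a client with a 10% quota that bursts to 60%
> > > measured usage would see
> > >
> > >     // overage relative to quota, capped at one window
> > >     double used = 60.0, quota = 10.0, windowSec = 1.0;
> > >     double delaySec = Math.min((used - quota) / quota * windowSec, windowSec);
> > >     // uncapped: (60 - 10) / 10 * 1 = 5s; capped: at most 1s per
> > >     // response, while later requests stay throttled until samples age out
> > >
> > > so no single response waits longer than one window, but the client is
> > > still held back over the full sample period.
> > >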
> > > 33. Updated to use request_percentage.
> > >
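> > > For example, following the layout of the existing byte-rate quota
> > > configs in ZooKeeper, a /config/users/<user> znode would then
> > > presumably hold something like (hypothetical value):
> > >
> > >     {"version":1,"config":{"request_percentage":"50"}}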
> > >
> > > On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > Thanks for the updated KIP. A few more comments.
> > > >
> > > > 30. Should we just account for the time in network threads in this
> > > > KIP too? The issue with doing this later is that existing quotas may
> > > > be too small and everyone will have to adjust them before upgrading,
> > > > which is inconvenient. If we just do the delaying in the io threads,
> > > > there probably isn't too much additional work to include the network
> > > > thread time?
> > > >
> > > > 31. It would be useful for the new metrics to capture the utilization
> > > > of all those requests exempt from request throttling (under sth like
> > > > "exempt"). It's useful for an admin to know how much time is spent
> > > > there too.
> > > >
> > > > 32. "The maximum throttle time for any single request will be the
> quota
> > > > window size (one second by default)." We probably should cap the
> delay
> > at
> > > > quota.window.size.seconds * quota.window.num?
> > > >
> > > > 33. It's unfortunate that we use . in configs and _ in ZK data
> > > > structures. However, for consistency, request.percentage in ZK
> > > > probably should be request_percentage?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > wrote:
> > > >
> > > > > I have updated the KIP to use "request.percentage" quotas where the
> > > > > percentage is out of a total of (num.io.threads * 100). I have added
> > > > > the other options considered so far under "Rejected Alternatives".
> > > > >
> > > > > To address Todd's concern about per-thread quotas: Even though the
> > > > > quotas are out of (num.io.threads * 100), clients are not locked
> > > > > into threads. Utilization is measured as the total across all the
> > > > > I/O threads, and a 10% quota can be 1% of 10 threads. Individual
> > > > > quotas can also be greater than 100% if required.
> > > > >
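> > > > > To make that concrete, a rough sketch of the accounting as described
> > > > > (all names and values here are made up for illustration, not the
> > > > > KIP's actual code):
> > > > >
> > > > >     // Hypothetical per-thread time samples for one client in the
> > > > >     // current quota window; the client is never pinned to a thread.
> > > > >     long[] perIoThreadTimeNanosForClient = {60_000_000L, 40_000_000L};
> > > > >     long windowNanos = 1_000_000_000L;  // one quota window
> > > > >     double quotaPercentage = 10.0;      // out of num.io.threads * 100
> > > > >
> > > > >     long busyNanos = 0;
> > > > >     for (long t : perIoThreadTimeNanosForClient)
> > > > >         busyNanos += t;                 // sum across *all* I/O threads
> > > > >     // 10 threads at 1% each and 1 thread at 10% both measure 10.0:
> > > > >     double usedPercentage = 100.0 * busyNanos / windowNanos;
> > > > >     boolean violated = usedPercentage > quotaPercentage;
> > > > >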
> > > > > Please let me know if there are any other concerns or suggestions.
> > > > >
> > > > > Thank you,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tp...@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Rajini -
> > > > > >
> > > > > > I understand what you’re saying, but the point I’m making is that
> > > > > > I don’t believe we need to take it into account directly. The CPU
> > > > > > utilization of the network threads is directly proportional to the
> > > > > > number of bytes being sent. The more bytes, the more CPU that is
> > > > > > required for SSL (or other tasks). This is opposed to the request
> > > > > > handler threads, where there are a number of factors that affect
> > > > > > CPU utilization. This means that it’s not necessary to separately
> > > > > > quota network thread byte usage and CPU - if we quota byte usage
> > > > > > (which we already do), we have fixed the CPU usage at a
> > > > > > proportional amount.
> > > > > >
> > > > > > Jun -
> > > > > >
> > > > > > Thanks for the clarification there. I was thinking of the
> > > > > > utilization percentage as being fixed, not what the percentage
> > > > > > reflects. I’m not tied to either way of doing it, provided that we
> > > > > > do not lock clients to a single thread. For example, if I specify
> > > > > > that a given client can use 10% of a single thread, that should
> > > > > > also mean they can use 1% on 10 threads.
> > > > > >
> > > > > > -Todd
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi, Todd,
> > > > > > >
> > > > > > > Thanks for the feedback.
> > > > > > >
> > > > > > > I just want to clarify your second point. If the limit
> > > > > > > percentage is per thread and the thread counts are changed, the
> > > > > > > absolute processing limit for existing users hasn't changed and
> > > > > > > there is no need to adjust them. On the other hand, if the limit
> > > > > > > percentage is of total thread pool capacity and the thread
> > > > > > > counts are changed, the effective processing limit for a user
> > > > > > > will change. So, to preserve the current processing limit,
> > > > > > > existing user limits have to be adjusted. If there is a hardware
> > > > > > > change, the effective processing limit for a user will change in
> > > > > > > either approach and the existing limit may need to be adjusted.
> > > > > > > However, hardware changes are less common than thread pool
> > > > > > > configuration changes.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tpalino@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > I’ve been following this one on and off, and overall it sounds
> > > > > > > > good to me.
> > > > > > > >
> > > > > > > > - The SSL question is a good one. However, that type of
> > > > > > > > overhead should be proportional to the bytes rate, so I think
> > > > > > > > that a bytes rate quota would still be a suitable way to
> > > > > > > > address it.
> > > > > > > >
> > > > > > > > - I think it’s better to make the quota percentage of total
> > > > > > > > thread pool capacity, and not percentage of an individual
> > > > > > > > thread. That way you don’t have to adjust it when you adjust
> > > > > > > > thread counts (tuning, hardware changes, etc.)
> > > > > > > >
> > > > > > > >
> > > > > > > > -Todd
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <becket.qin@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > I see. Good point about SSL.
> > > > > > > > >
> > > > > > > > > I just asked Todd to take a look.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > >
> > > > > > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Jiangjie,
> > > > > > > > > >
> > > > > > > > > > Yes, I agree that byte rate already protects the network
> > > > > > > > > > threads indirectly. I am not sure if byte rate fully
> > > > > > > > > > captures the CPU overhead in network due to SSL. So, at the
> > > > > > > > > > high level, we can use request time limit to protect CPU
> > > > > > > > > > and use byte rate to protect storage and network.
> > > > > > > > > >
> > > > > > > > > > Also, do you think you can get Todd to comment on this KIP?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <becket.qin@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi Rajini/Jun,
> > > > > > > > > > >
> > > > > > > > > > > The percentage based reasoning sounds good.
> > > > > > > > > > > One thing I am wondering is that if we assume the
> > > > > > > > > > > network threads are just doing the network IO, can we say
> > > > > > > > > > > the bytes rate quota is already sort of a network threads
> > > > > > > > > > > quota?
> > > > > > > > > > > If we take network threads into the consideration here,
> > > > > > > > > > > would that be somewhat overlapping with the bytes rate
> > > > > > > > > > > quota?
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Jun,
> > > > > > > > > > > >
> > > > > > > > > > > > Thank you for the explanation, I hadn't realized you
> > > > > > > > > > > > meant percentage of the total thread pool. If everyone
> > > > > > > > > > > > is OK with Jun's suggestion, I will update the KIP.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Rajini
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Let's take your example. Let's say a user sets the
> > > > > > > > > > > > > limit to 50%. I am not sure if it's better to apply
> > > > > > > > > > > > > the same percentage separately to network and io
> > > > > > > > > > > > > thread pool. For example, for produce requests, most
> > > > > > > > > > > > > of the time will be spent in the io threads whereas
> > > > > > > > > > > > > for fetch requests, most of the time will be in the
> > > > > > > > > > > > > network threads. So, using the same percentage in
> > > > > > > > > > > > > both thread pools means one of the pools' resource
> > > > > > > > > > > > > will be over allocated.
> > > > > > > > > > > > >
> > > > > > > > > > > > > An alternative way is to simply model network and io
> > > > > > > > > > > > > thread pool together. If you get 10 io threads and 5
> > > > > > > > > > > > > network threads, you get 1500% request processing
> > > > > > > > > > > > > power. A 50% limit means a total of 750% processing
> > > > > > > > > > > > > power. We just add up the time a user request spent
> > > > > > > > > > > > > in either network or io thread. If that total exceeds
> > > > > > > > > > > > > 750% (doesn't matter whether it's spent more in
> > > > > > > > > > > > > network or io thread), the request will be throttled.
> > > > > > > > > > > > > This seems more general and is not sensitive to the
> > > > > > > > > > > > > current implementation detail of having a separate
> > > > > > > > > > > > > network and io thread pool. In the future, if the
> > > > > > > > > > > > > threading model changes, the same concept of quota
> > > > > > > > > > > > > can still be applied. For now, since it's a bit
> > > > > > > > > > > > > tricky to add the delay logic in the network thread
> > > > > > > > > > > > > pool, we could probably just do the delaying only in
> > > > > > > > > > > > > the io threads as you suggested earlier.
> > > > > > > > > > > > >
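> > > > > > > > > > > > > Spelling that combined check out with those numbers
> > > > > > > > > > > > > (variable names made up for illustration):
> > > > > > > > > > > > >
> > > > > > > > > > > > >     int numIoThreads = 10, numNetworkThreads = 5;
> > > > > > > > > > > > >     double budget = (numIoThreads + numNetworkThreads) * 100.0;
> > > > > > > > > > > > >     double limit = 0.50 * budget;             // 50% of 1500% = 750%
> > > > > > > > > > > > >     double ioPct = 400.0, networkPct = 360.0; // measured per pool
> > > > > > > > > > > > >     double used = ioPct + networkPct;         // summed, not split
> > > > > > > > > > > > >     boolean throttle = used > limit;          // 760 > 750 => throttle
> > > > > > > > > > > > >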
> > > > > > > > > > > > > There is still the orthogonal question of whether a
> > > > > > > > > > > > > quota of 50% is out of 100% or 100% * #total
> > > > > > > > > > > > > processing threads. My feeling is that the latter is
> > > > > > > > > > > > > slightly better based on my explanation earlier. The
> > > > > > > > > > > > > way to describe this quota to the users can be "share
> > > > > > > > > > > > > of elapsed request processing time on a single CPU"
> > > > > > > > > > > > > (similar to top).
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > But still not sure about a single quota covering
> > > > > > > > > > > > > > both network threads and I/O threads with
> > > > > > > > > > > > > > per-thread quota. If there are 10 I/O threads and 5
> > > > > > > > > > > > > > network threads and I want to assign half the quota
> > > > > > > > > > > > > > to userA, the quota would be 750%. I imagine,
> > > > > > > > > > > > > > internally, we would convert this to 500% for I/O
> > > > > > > > > > > > > > and 250% for network threads to allocate 50% of
> > > > > > > > > > > > > > each pool.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > A couple of scenarios:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. Admin adds 1 extra network thread. To retain
> > > > > > > > > > > > > > 50%, admin needs to now allocate 800% for each
> > > > > > > > > > > > > > user. Or increase the quota for a few users. To me,
> > > > > > > > > > > > > > it feels like admin needs to convert 50% to 800%
> > > > > > > > > > > > > > and Kafka internally needs to convert 800% to
> > > > > > > > > > > > > > (500%, 300%). Everyone using just 50% feels a lot
> > > > > > > > > > > > > > simpler.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. We decide to add some other thread to this
> > > > > > > > > > > > > > list. Admin needs to know exactly how many threads
> > > > > > > > > > > > > > form the maximum quota. And we can be changing this
> > > > > > > > > > > > > > between broker versions as we add more to the list.
> > > > > > > > > > > > > > Again a single overall percent would be a lot
> > > > > > > > > > > > > > simpler.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > There were others who were unconvinced by a single
> > > > > > > > > > > > > > percent from the initial proposal and were happier
> > > > > > > > > > > > > > with thread units similar to CPU units, so I am ok
> > > > > > > > > > > > > > with going with per-thread quotas (as units or
> > > > > > > > > > > > > > percent). Just not sure it makes it easier for
> > > > > > > > > > > > > > admin in all cases.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Consider modeling as n * 100% unit. For 2), the
> > > > > > > > > > > > > > > question is what's causing the I/O threads to be
> > > > > > > > > > > > > > > saturated. It's unlikely that all users'
> > > > > > > > > > > > > > > utilization has increased at the same time. A
> > > > > > > > > > > > > > > more likely case is that a few isolated users'
> > > > > > > > > > > > > > > utilization has increased. If so, after
> > > > > > > > > > > > > > > increasing the number of threads, the admin just
> > > > > > > > > > > > > > > needs to adjust the quota for a few isolated
> > > > > > > > > > > > > > > users, which is expected and is less work.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all
> > > > > > > > > > > > > > > users' quotas need to be adjusted, which is
> > > > > > > > > > > > > > > unexpected and is more work.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > So, to me, the n * 100% model seems more
> > > > > > > > > > > > > > > convenient.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > As for future extension to cover network thread
> > > > > > > > > > > > > > > utilization, I was thinking that one way is to
> > > > > > > > > > > > > > > simply model the capacity as (n + m) * 100% unit,
> > > > > > > > > > > > > > > where n and m are the number of network and i/o
> > > > > > > > > > > > > > > threads, respectively. Then, for each user, we
> > > > > > > > > > > > > > > can just add up the utilization in the network
> > > > > > > > > > > > > > > and the i/o thread. If we do this, we don't need
> > > > > > > > > > > > > > > a new type of quota.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If we use request.percentage as the percentage
> > > > > > > > > > > > > > > > used in a single I/O thread, the total
> > > > > > > > > > > > > > > > percentage being allocated will be
> > > > > > > > > > > > > > > > num.io.threads * 100 for I/O threads and
> > > > > > > > > > > > > > > > num.network.threads * 100 for network threads.
> > > > > > > > > > > > > > > > A single quota covering the two as a percentage
> > > > > > > > > > > > > > > > wouldn't quite work if you want to allocate the
> > > > > > > > > > > > > > > > same proportion in both cases. If we want to
> > > > > > > > > > > > > > > > treat threads as separate units, won't we need
> > > > > > > > > > > > > > > > two quota configurations regardless of whether
> > > > > > > > > > > > > > > > we use units or percentage? Perhaps I
> > > > > > > > > > > > > > > > misunderstood your suggestion.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >    1. The use case that you mentioned where an
> > > > > > > > > > > > > > > >    admin is adding more users and decides to add
> > > > > > > > > > > > > > > >    more I/O threads and expects to find free
> > > > > > > > > > > > > > > >    quota to allocate for new users.
> > > > > > > > > > > > > > > >    2. Admin adds more I/O threads because the
> > > > > > > > > > > > > > > >    I/O threads are saturated and there are cores
> > > > > > > > > > > > > > > >    available to allocate, even though the number
> > > > > > > > > > > > > > > >    of users/clients hasn't changed.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If we treated I/O threads as a single unit of
> > > > > > > > > > > > > > > > 100%, all user quotas need to be reallocated
> > > > > > > > > > > > > > > > for 1). If we allocated I/O threads as n units
> > > > > > > > > > > > > > > > with n*100%, all user quotas need to be
> > > > > > > > > > > > > > > > reallocated for 2), otherwise some of the new
> > > > > > > > > > > > > > > > threads may just not be used. Either way it
> > > > > > > > > > > > > > > > should be easy to write a script to
> > > > > > > > > > > > > > > > decrease/increase quotas by a multiple for all
> > > > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > So it really boils down to which quota unit is
> > > > > > > > > > > > > > > > most intuitive in terms of configuration. And
> > > > > > > > > > > > > > > > from the discussion so far, it feels like
> > > > > > > > > > > > > > > > opinion is divided on whether quotas should be
> > > > > > > > > > > > > > > > carved out of an absolute 100% (or 1 unit) or
> > > > > > > > > > > > > > > > be relative to the number of threads (n*100%
> > > > > > > > > > > > > > > > or n units).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Another way to express an absolute limit is
> > > > > > > > > > > > > > > > > to use request.percentage, but treat it as
> > > > > > > > > > > > > > > > > the percentage used in a single request
> > > > > > > > > > > > > > > > > handling thread. For now, the request
> > > > > > > > > > > > > > > > > handling threads can be just the io threads.
> > > > > > > > > > > > > > > > > In the future, they can cover the network
> > > > > > > > > > > > > > > > > threads as well. This is similar to how top
> > > > > > > > > > > > > > > > > reports CPU usage and may be a bit easier
> > > > > > > > > > > > > > > > > for people to understand.
> > > > > > > > > > > > > > > > >
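For illustration, a small sketch of this per-thread reading of request.percentage (the class and numbers are invented): a quota of 150 means "1.5 request handler threads' worth of time", exactly like top reporting 150% CPU on a multi-core machine.

    public class PerThreadPercentage {
        // 100 (%) of one thread corresponds to 1000 ms of handler time per second.
        public static double allowedMillisPerSecond(double percentagePerThread) {
            return percentagePerThread / 100.0 * 1000.0;
        }

        public static void main(String[] args) {
            System.out.println(allowedMillisPerSecond(50));   // 500 ms/s: half a thread
            System.out.println(allowedMillisPerSecond(150));  // 1500 ms/s: 1.5 threads
        }
    }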
Thanks,

Jun
On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Jay,

2. Regarding request.unit vs request.percentage. I started with request.percentage too. The reasoning for request.unit is the following. Suppose that the capacity has been reached on a broker and the admin needs to add a new user. A simple way to increase the capacity is to increase the number of io threads, assuming there are still enough cores. If the limit is based on percentage, the additional capacity automatically gets distributed to existing users and we haven't really carved out any additional resource for the new user. Now, is it easy for a user to reason about 0.1 unit vs 10%? My feeling is that both are hard and have to be configured empirically. Not sure if percentage is obviously easier to reason about.
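A rough illustration of this difference, with hypothetical numbers (not from the KIP):

    // Absolute units leave existing users' capacity unchanged when threads are
    // added, so the new headroom can be carved out for a new user; percentages
    // redistribute the new capacity to existing users automatically.
    public class CapacityCarving {
        public static void main(String[] args) {
            int threadsBefore = 8, threadsAfter = 10;

            // Percentage scheme: 10% of total pool capacity per user grows
            // automatically with the pool.
            double pct = 0.10;
            System.out.printf("percentage: %.1f -> %.1f thread-seconds/s%n",
                    pct * threadsBefore, pct * threadsAfter);

            // Unit scheme: 0.8 thread-units per user is unaffected by the
            // resize; the 2 new threads stay free for the new user.
            double units = 0.8;
            System.out.printf("units: %.1f -> %.1f thread-seconds/s%n", units, units);
        }
    }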
Thanks,

Jun
On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:

A couple of quick points:

1. Even though the implementation of this quota is only using io thread time, I think we should call it something like "request-time". This will give us flexibility to improve the implementation to cover network threads in the future and will avoid exposing internal details like our thread pools on the server.

2. Jun/Roger, I get what you are trying to fix, but the idea of thread/units is super unintuitive as a user-facing knob. I had to read the KIP like eight times to understand this. I'm not sure about your point that increasing the number of threads is a problem with a percentage-based value; it really depends on whether the user thinks about the "percentage of request processing time" or "thread units". If they think "I have allocated 10% of my request processing time to user x" then it is a bug that increasing the thread count decreases that percent, as it does in the current proposal (see the sketch after these points). As a practical matter I think the only way to actually reason about this is as a percent; I just don't believe people are going to think, "ah, 4.3 thread units, that is the right amount!". Instead I think they have to understand this thread unit concept, figure out what they have set in number of threads, compute a percent and then come up with the number of thread units, and these will all be wrong if that thread count changes. I also think this ties us to throttling the I/O thread pool, which may not be where we want to end up.

3. For what it's worth I do think having a single throttle_ms field in all the responses that combines all throttling from all quotas is probably the simplest. There could be a use case for having separate fields for each, but I think that is actually harder to use/monitor in the common case, so unless someone has a use case I think just one should be fine.
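For illustration, the "percentage of request processing time" view in point 2 could be computed along these lines (an assumed accounting model, not the KIP's implementation):

    // Time used by a user in a window, divided by the total capacity of the
    // I/O thread pool in that window.
    public class RequestTimePercent {
        public static double percentUsed(long userNanos, int numIoThreads, long windowNanos) {
            return 100.0 * userNanos / ((double) numIoThreads * windowNanos);
        }

        public static void main(String[] args) {
            // 800 ms of handler time in a 1 s window on an 8-thread pool = 10%.
            System.out.println(percentUsed(800_000_000L, 8, 1_000_000_000L));
            // The same absolute usage on a 16-thread pool reads as 5%: a quota
            // expressed this way tracks the pool size automatically, which is
            // exactly the behaviour being debated above.
            System.out.println(percentUsed(800_000_000L, 16, 1_000_000_000L));
        }
    }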
-Jay
On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

I have updated the KIP based on the discussions so far.

Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like LeaderAndIsr etc. The simplest way to ensure that clients cannot use these requests to bypass quotas for DoS attacks is to ensure that ACLs prevent clients from using these requests and that unauthorized requests are included towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a separate throttle time, and all utilization based quotas could use the same field (we won't add another one for network thread utilization, for instance). But perhaps it makes sense to keep byte rate quotas separate in produce/fetch responses to provide separate metrics? Agree with Ismael that the name of the existing field should be changed if we have two. Happy to switch to a single combined throttle time if that is sufficient.

Ismael #4, #5, #6: Will update KIP. Will use a dot separated name for the new property. Replication quotas use dot separated, so it will be consistent with all properties except byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen because the time per request can vary significantly between requests, as mentioned in the discussion and KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more configuration and more metrics. Since most users would set quotas higher than the expected usage and quotas are more of a safety net, a single quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of active connections, since only one request per connection will be throttled at a time.
#4 As with byte rate quotas, to use the full allocated quotas, clients/users would need to use partitions that are distributed across the cluster. The alternative of using cluster-wide quotas instead of per-broker quotas would be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for the quota types Fetch and Produce. A new one will be added for IOThread, which manages quotas for I/O thread utilization. This will not update the Fetch or Produce queue-size, but will have a separate metric for the queue-size. I wasn't planning to add any additional metrics apart from the equivalent ones for existing quotas as part of this KIP. A ratio of byte-rate to I/O thread utilization could be slightly misleading since it depends on the sequence of requests. But we can look into more metrics after the KIP is implemented if required.

I think we need to limit the maximum delay since all requests are throttled. If a client has a quota of 0.001 units and a single request used 50ms, we don't want to delay all requests from the client by 50 seconds, throwing the client out of all its consumer groups. The issue is only if a user is allocated a quota that is insufficient to process one large request. The expectation is that the units allocated per user will be much higher than the time taken to process one request and the limit should seldom be applied. Agree this needs proper documentation.
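To make the 0.001-unit example concrete, a rough sketch of a capped delay computation (the formula here is an assumption for illustration; the KIP defines the actual behaviour):

    // Hypothetical illustration of capping the throttle delay at the quota window.
    public class ThrottleDelay {
        // quotaUnits: fraction of one I/O thread (0.001 == 1 ms/s of handler time)
        public static long delayMillis(long requestTimeMs, double quotaUnits, long windowMs) {
            // Uncapped: time needed for the window to "absorb" the overage.
            long uncapped = (long) (requestTimeMs / quotaUnits) - requestTimeMs;
            return Math.min(uncapped, windowMs); // cap at one window
        }

        public static void main(String[] args) {
            // A single 50 ms request against a 0.001-unit quota would call for
            // ~50 seconds of delay; the cap keeps it to one window instead.
            System.out.println(delayMillis(50, 0.001, 1000)); // prints 1000
        }
    }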
Regards,

Rajini
On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:

@jun: i wasn't concerned about tying up a request processing thread, but IIUC the code does still read the entire request out, which might add up to a non-negligible amount of memory.
On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to the window size if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota over a long period of time. Can you provide an upper bound on this deviation? (a rough model is sketched after this list)

2) What is the motivation for capping the maximum delay by the window size? I am wondering if there is a better alternative to address the problem.

3) It means that the existing metric-related config will have a more direct impact on the mechanism of this io-thread-unit-based quota. This may be an important change depending on the answer to 1) above. We probably need to document this more explicitly.
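For intuition on 1), a back-of-the-envelope model (an assumption for illustration, not taken from the KIP): if each request costs C ms of I/O thread time but is delayed by at most one window of W ms, then a client issuing one such request per delay can sustain roughly C/(C+W) thread units, no matter how small its configured quota is.

    // Hypothetical worst-case model for the capped-delay deviation in 1).
    public class QuotaDeviation {
        // Achieved utilization when each request is delayed by at most one
        // window: the client spends requestTimeMs of handler time every
        // (requestTimeMs + windowMs) milliseconds of wall-clock time.
        public static double sustainedUnits(long requestTimeMs, long windowMs) {
            return (double) requestTimeMs / (requestTimeMs + windowMs);
        }

        public static void main(String[] args) {
            // 50 ms requests against a 1 s window: ~0.048 units sustained,
            // far above a configured quota of, say, 0.001 units.
            System.out.println(sustainedUnits(50, 1000));
        }
    }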
Dong
On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't, because at LinkedIn it would be too much pressure on inGraph to expose those per-clientId metrics, so we ended up printing them periodically to a local log. Never mind if it is not a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for every quota in ProduceResponse or FetchResponse. Is there any use-case for having separate throttle-time fields for byte-rate-quota and io-thread-unit-quota? You probably need to document this as an interface change if you plan to add a new field in any request.

- I don't think IOThread belongs to quotaType. The existing quota types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the type of requests that are throttled, not the quota mechanism that is applied.

- If a request is throttled due to this io-thread-unit-based quota, is the existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the io-thread-unit-based quota and for users to understand its impact on their traffic, would it be useful to have a metric that shows the overall byte-rate per io-thread-unit? Can we also show this as a per-clientId metric?

Thanks,
Dong
On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU cores, but it's possible for an admin to start with fewer io threads than cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of returning an error immediately is that the latter has no way to prevent the client from retrying immediately, which will make things worse. The delaying logic is based off a delay queue. A separate expiration thread just waits on the next request to be expired. So, it doesn't tie up a request handler thread.
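For illustration, a minimal sketch of this delay-queue mechanism using the JDK's DelayQueue (simplified; the broker's actual throttling code differs in detail):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    public class ThrottleQueueSketch {
        static class ThrottledResponse implements Delayed {
            final long expireAtNanos;
            final Runnable sendResponse;

            ThrottledResponse(long delayMs, Runnable sendResponse) {
                this.expireAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
                this.sendResponse = sendResponse;
            }

            public long getDelay(TimeUnit unit) {
                return unit.convert(expireAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
            }

            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                        other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        public static void main(String[] args) throws InterruptedException {
            DelayQueue<ThrottledResponse> queue = new DelayQueue<>();
            queue.put(new ThrottledResponse(100, () -> System.out.println("sent")));
            // A single expiration thread blocks here until the head expires, so
            // no request handler thread is tied up while responses are delayed.
            queue.take().sendResponse.run();
        }
    }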
Thanks,

Jun
On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle time field in the response. The downside is that the client metrics will be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and `log.cleaner.min.cleanable.ratio`.

Ismael
On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should have the total time your request was throttled, irrespective of the quotas that caused that? Limiting it to the byte rate quota doesn't make sense, but also I don't think we want to end up adding new fields in the response for every single thing we quota, right? (see the sketch after this list)
   2. I don't think we should make this quota specifically about io threads. Once we introduce these quotas people set them and expect them to be enforced (and if they aren't it may cause an outage). As a result they are a bit more sensitive than normal configs, I think. The current thread pools seem like something of an implementation detail and not the level the user-facing quotas should be involved with. I think it might be better to make this a general request-time throttle with no mention in the naming about I/O threads and simply acknowledge the current limitation (which we may someday fix) in the docs that this covers only the time after the request is read off the network.
   3. As such I think the right interface to the user would be something like percent_request_time in {0,...,100} or request_time_ratio in {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale is between 0 and 1 in the other metrics, right?)
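A sketch of the single combined field from point 1 (illustrative; the method name and the use of max are assumptions, on the reading that a response is held until the longest applicable delay has passed rather than the sum of the delays):

    public class CombinedThrottleTime {
        // Hypothetical combination of per-quota throttle times into one field.
        public static int throttleTimeMs(int byteRateThrottleMs, int requestTimeThrottleMs) {
            return Math.max(byteRateThrottleMs, requestTimeThrottleMs);
        }

        public static void main(String[] args) {
            System.out.println(throttleTimeMs(120, 300)); // 300
        }
    }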
-Jay
On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are going to be very similar to the existing metrics and sensors. To avoid confusion, I have now added more detail. All metrics are in the group "quotaType" and all sensors have names starting with "quotaType" (where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*). So there will be no reuse of existing metrics/sensors. The new ones for request processing time based throttling will be completely independent of existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be impacted by this KIP. That will continue to return byte-rate based throttling times. In addition, a new field request_throttle_time_ms will be added to return request quota based throttling times. These will be exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I believe there are already sufficient metrics to monitor throttling on both the client and broker side for each type of throttling.

Regards,

Rajini
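For illustration, the per-quota-type naming scheme Rajini describes above could look roughly like this (a sketch; the exact name format is invented, not the KIP's definition):

    public class QuotaSensors {
        // The enum values mirror the quota types named in the thread.
        enum QuotaType { Produce, Fetch, LeaderReplication, FollowerReplication, IOThread }

        static String throttleTimeSensorName(QuotaType type, String clientId) {
            return type + "ThrottleTime-" + clientId;
        }

        public static void main(String[] args) {
            System.out.println(throttleTimeSensorName(QuotaType.IOThread, "clientA"));
        }
    }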
On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to quota users' traffic here. LGTM overall. I have some questions regarding sensors.

- Can you be more specific in the KIP about what sensors will be added? For example, it will be useful to specify the name and attributes of these new sensors.

- We currently have throttle-time and queue-size for the byte-rate based quota. Are you going to have separate throttle-time and queue-size for requests throttled by the io_thread_unit-based quota, or will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain time due to the io_thread_unit-based quota?

- Currently the kafka server doesn't provide any log or metrics that tell whether any given clientId (or user) is throttled. This is not too bad because we can still check the client-side byte-rate metric to validate whether a given client is throttled. But with this io_thread_unit, there will be no way to validate whether a given client is slow because it has exceeded its io_thread_unit limit. It is necessary for users to be able to know this information to figure out whether they have reached their quota limit. How about we add a log4j log on the server side to periodically print the (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that administrators can figure out which users have reached their limit and act accordingly?

Thanks,
Dong
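A minimal sketch of the periodic logging Dong suggests above (hypothetical; the Supplier is a placeholder for whatever metrics source the broker actually exposes):

    import java.util.HashMap;
    import java.util.Map;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import java.util.function.Supplier;

    // Periodically log per-client throttle times so an administrator can spot
    // clients hitting their quotas.
    public class ThrottleLogger {
        public static void start(Supplier<Map<String, long[]>> throttleTimes) {
            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                for (Map.Entry<String, long[]> e : throttleTimes.get().entrySet()) {
                    long[] t = e.getValue(); // [byteRateMs, ioThreadUnitMs]
                    System.out.printf(
                        "client_id=%s byte-rate-throttle-time=%dms io-thread-unit-throttle-time=%dms%n",
                        e.getKey(), t[0], t[1]);
                }
            }, 0, 60, TimeUnit.SECONDS);
        }

        public static void main(String[] args) {
            Map<String, long[]> sample = new HashMap<>();
            sample.put("clientA", new long[]{120L, 300L});
            start(() -> sample); // runs until the process is killed
        }
    }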
On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:

It is stated as "Request processing time throttling will be applied on top if necessary." I thought that meant the request processing time throttling is applied first, but reading on I found it actually meant that the produce / fetch byte rate throttling is applied first.

Also the last sentence, "The remaining delay if any is applied to the response.", is a bit confusing to me. Maybe reword it a bit?

Guozhang
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017
> > at
> > > > 3:24
> > > > > > PM,
> > > > > > > > Jun
> > > > > > > > > > > Rao <
> > > > > > > > > > > > > > > > > >> > jun@confluent.io
> > > > > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > Thanks for the
> > updated
> > > > > KIP.
> > > > > > > The
> > > > > > > > > > latest
> > > > > > > > > > > > > > > proposal
> > > > > > > > > > > > > > > > > >> looks
> > > > > > > > > > > > > > > > > >> > >> good
> > > > > > > > > > > > > > > > > >> > >> > to
> > > > > > > > > > > > > > > > > >> > >> > >> me.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > Jun
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > On Wed, Feb 22,
> 2017
> > > at
> > > > > 2:19
> > > > > > > PM,
> > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > Sivaram
> > > > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > > > >> > >> > >> > > > >
> rajinisivaram@gmail.com
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Thank you for
> the
> > > > > > feedback.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > 1. I have
> updated
> > > the
> > > > > KIP
> > > > > > to
> > > > > > > > use
> > > > > > > > > > > > > absolute
> > > > > > > > > > > > > > > > units
> > > > > > > > > > > > > > > > > >> > >> instead of
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > percentage.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > The
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > property is
> > called*
> > > > > > > > > > io_thread_units*
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > align
> > > > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > > > >> > >> > >> thread
> > > > > > > > > > > > > > > > > >> > >> > >> > > count
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > property
> > > > > *num.io.threads*.
> > > > > > > > When
> > > > > > > > > we
> > > > > > > > > > > > > > implement
> > > > > > > > > > > > > > > > > >> network
> > > > > > > > > > > > > > > > > >> > >> > thread
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > utilization
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > quotas, we can
> add
> > > > > another
> > > > > > > > > > property
> > > > > > > > > > > > > > > > > >> > >> > *network_thread_units.*
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > 2.
> > > ControlledShutdown
> > > > is
> > > > > > > > already
> > > > > > > > > > > > listed
> > > > > > > > > > > > > > > under
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> exempt
> > > > > > > > > > > > > > > > > >> > >> > >> > > requests.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > Jun,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > did
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > you mean a
> > different
> > > > > > request
> > > > > > > > > that
> > > > > > > > > > > > needs
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > >> added?
> > > > > > > > > > > > > > > > > >> > >> The
> > > > > > > > > > > > > > > > > >> > >> > >> four
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > currently exempt
> > in
> > > > the
> > > > > > KIP
> > > > > > > > are
> > > > > > > > > > > > > > StopReplica,
> > > > > > > > > > > > > > > > > >> > >> > >> > ControlledShutdown,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > LeaderAndIsr and
> > > > > > > > UpdateMetadata.
> > > > > > > > > > > These
> > > > > > > > > > > > > are
> > > > > > > > > > > > > > > > > >> controlled
> > > > > > > > > > > > > > > > > >> > >> > using
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > ClusterAction
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > ACL, so it is
> easy
> > > to
> > > > > > > exclude
> > > > > > > > > and
> > > > > > > > > > > only
> > > > > > > > > > > > > > > > throttle
> > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > >> > >> > >> > unauthorized.
> > > > > > > > > > > > > > > > > >> > >> > >> > > I
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > wasn't
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > sure if there
> are
> > > > other
> > > > > > > > requests
> > > > > > > > > > > used
> > > > > > > > > > > > > only
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > >> > >> > inter-broker
> > > > > > > > > > > > > > > > > >> > >> > >> > that
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > needed
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > be excluded.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > 3. I was
> thinking
> > > the
> > > > > > > smallest
> > > > > > > > > > > change
> > > > > > > > > > > > > > would
> > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > >> > >> replace
> > > > > > > > > > > > > > > > > >> > >> > >> all
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > references
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > *requestChannel.sendResponse()
> > > > > > > > *
> > > > > > > > > > > with
> > > > > > > > > > > > a
> > > > > > > > > > > > > > > local
> > > > > > > > > > > > > > > > > >> method
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > *sendResponseMaybeThrottle()*
> > > > > > > > > that
> > > > > > > > > > > > does
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > throttling
> > > > > > > > > > > > > > > > > >> > >> if
> > > > > > > > > > > > > > > > > >> > >> > >> any
> > > > > > > > > > > > > > > > > >> > >> > >> > > plus
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > send
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > response. If we
> > > > throttle
> > > > > > > first
> > > > > > > > > in
> > > > > > > > > > > > > > > > > >> > *KafkaApis.handle()*,
> > > > > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > > > > >> > >> > >> > time
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > spent
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > within the
> method
> > > > > handling
> > > > > > > the
> > > > > > > > > > > request
> > > > > > > > > > > > > > will
> > > > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > >> > >> > recorded
> > > > > > > > > > > > > > > > > >> > >> > >> or
> > > > > > > > > > > > > > > > > >> > >> > >> > > used
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > in
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > throttling. We
> can
> > > > look
> > > > > > into
> > > > > > > > > this
> > > > > > > > > > > > again
> > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > >> PR
> > > > > > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > > > > > >> > >> > ready
> > > > > > > > > > > > > > > > > >> > >> > >> > for
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > review.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Regards,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Rajini
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
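A minimal Java sketch of the wrapper Rajini describes in point 3. The
broker code is actually Scala, and every type and method below is an
illustrative stand-in rather than the real KafkaApis API:

    import java.util.Queue;
    import java.util.concurrent.ConcurrentLinkedQueue;

    // Illustrative sketch: route every response through one local method
    // that records processing time and applies any quota delay before sending.
    public class ResponseSender {
        // Hypothetical stand-ins for broker internals.
        interface QuotaManager { long recordAndMaybeThrottleMs(String clientId, long timeNanos); }
        interface RequestChannel { void sendResponse(Object response); }
        record Request(String clientId, long startTimeNanos) {}
        record DelayedResponse(Request request, Object response, long delayMs) {}

        private final QuotaManager quotaManager;
        private final RequestChannel requestChannel;
        // Delayed responses would really sit in purgatory; a queue stands in here.
        private final Queue<DelayedResponse> delayed = new ConcurrentLinkedQueue<>();

        public ResponseSender(QuotaManager quotaManager, RequestChannel requestChannel) {
            this.quotaManager = quotaManager;
            this.requestChannel = requestChannel;
        }

        // Replaces direct requestChannel.sendResponse() calls in each handler,
        // so the time spent inside the handler is included in the quota.
        public void sendResponseMaybeThrottle(Request request, Object response) {
            long processingTimeNanos = System.nanoTime() - request.startTimeNanos();
            long throttleMs = quotaManager.recordAndMaybeThrottleMs(
                    request.clientId(), processingTimeNanos);
            if (throttleMs > 0) {
                delayed.add(new DelayedResponse(request, response, throttleMs));
            } else {
                requestChannel.sendResponse(response);
            }
        }
    }
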
On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1
request handler unit, then it's as if I have a Kafka broker with a single
request handler thread dedicated to me. That's the most I can use, at
least. That allocation doesn't change even if an admin later increases the
size of the request thread pool on the broker. It's similar to the CPU
abstraction that VMs and containers get from hypervisors or OS schedulers.
While different client access patterns can use wildly different amounts of
request thread resources per request, a given application will generally
have a stable access pattern and can figure out empirically how many
"request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota
based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to
be excluded from throttling.

3. Implementation-wise, I am wondering if it's simpler to apply the
request time throttling first in KafkaApis.handle(). Otherwise, we will
need to add the throttling logic in each type of request.

Thanks,

Jun

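To put illustrative numbers (not from the KIP) on Jun's point 1: with
num.io.threads=8, a 10% quota corresponds to 0.8 request handler threads'
worth of time; if the admin raises num.io.threads to 16, the same 10%
silently becomes 1.6 threads' worth, doubling the user's absolute capacity
without any quota change. An absolute setting such as io_thread_units=0.8
would stay fixed across the resize.
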
On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request
handler utilization. At the moment, it uses percentage, but I am happy to
change to a fraction (out of 1 instead of 100) if required. I have added
the examples from this discussion to the KIP. Also added a "Future Work"
section to address network thread utilization. The configuration is named
"request_time_percent" with the expectation that it can also be used as
the limit for network thread utilization when that is implemented, so that
users have to set only one config for the two and not have to worry about
the internal distribution of the work between the two thread pools in
Kafka.

Regards,

Rajini

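As an illustration of the single-config intent (the numbers are
hypothetical): request_time_percent=25 would cap a client at 25% of the
request handler threads' time today, and once network thread quotas are
implemented the same setting would also cap it at 25% of the network
threads' time, rather than asking users to split, say, 15% and 10% across
the two pools themselves.
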
On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is
exactly what people have said. I will just expand on that a bit. Consider
the following case. The producer sends a produce request with a 10MB
message but compressed to 100KB with gzip. The decompression of the
message on the broker could take 10-15 seconds, during which time a
request handler thread is completely blocked. In this case, neither the
byte-in quota nor the request rate quota may be effective in protecting
the broker. Consider another case. A consumer group starts with 10
instances and later on switches to 20 instances. The request rate will
likely double, but the actual load on the broker may not double since each
fetch request only contains half of the partitions. A request rate quota
may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much
of the server side resources. In this particular KIP, this resource is the
capacity of the request handler threads. I agree that it may not be
intuitive for the users to determine how to set the right limit. However,
this is not completely new and has been done in the container world
already. For example, Linux cgroup
(https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
has the concept of cpu.cfs_quota_us, which specifies the total amount of
time in microseconds for which all tasks in a cgroup can run during a one
second period. We can potentially model the request handler threads in a
similar way. For example, each request handler thread can be 1 request
handler unit and the admin can configure a limit on how many units (say
0.01) a client can have.

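A minimal sketch of the accounting Jun's cgroup analogy implies, with a
fixed one-second window standing in for cfs_period_us; all names here are
illustrative, not from the KIP:

    // Illustrative sketch: a client granted N request handler units may use
    // N seconds of handler-thread time per one-second window, so units can
    // be fractional (e.g. 0.01 == 10ms of handler time per second).
    public class RequestHandlerUnitQuota {
        private static final long WINDOW_NANOS = 1_000_000_000L; // like cfs_period_us = 1s

        private final double units; // quota, e.g. 0.01 request handler units
        private long windowStartNanos;
        private long usedNanos;

        public RequestHandlerUnitQuota(double units) {
            this.units = units;
        }

        // Record handler-thread time spent on this client's request; returns
        // true if the client is now over its allowance for the current window.
        public synchronized boolean recordAndCheck(long processingTimeNanos, long nowNanos) {
            if (nowNanos - windowStartNanos >= WINDOW_NANOS) { // new window: reset usage
                windowStartNanos = nowNanos;
                usedNanos = 0;
            }
            usedNanos += processingTimeNanos;
            return usedNanos > (long) (units * WINDOW_NANOS);
        }
    }

With units = 0.01 this allows 10ms of handler time per second, matching
Jun's example.
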
Regarding not throttling the internal broker-to-broker requests: we could
do that. Alternatively, we could just let the admin configure a high limit
for the kafka user (it may not be able to do that easily based on clientId
though).

Ideally we want to be able to protect the utilization of the network
thread pool too. The difficulty is mostly what Rajini said: (1) The
mechanism for throttling the requests is through Purgatory and we will
have to think through how to integrate that into the network layer. (2) In
the network layer, currently we know the user, but not the clientId of the
request. So, it's a bit tricky to throttle based on clientId there. Plus,
the byteOut quota can already protect the network thread utilization for
fetch requests. So, if we can't figure out this part right now, just
focusing on the request handling threads for this KIP is still a useful
feature.

Thanks,

Jun

On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc.; these are
throttled only if authorization fails (so they can't be used for DoS
attacks in a secure cluster, but inter-broker requests are allowed to
complete without delays).

I will wait another day to see if there is any objection to quotas based
on request processing time (as opposed to request rate) and if there are
no objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request
handler threads (that made calculation easy). I think the suggestion is to
include the time spent in the network threads as well since that may be
significant. As Jay pointed out, it is more complicated to calculate the
total available CPU time and convert to a ratio when there are *m* I/O
threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give
us what we want, but it can be very expensive on some platforms. As Becket
and Guozhang have pointed out, we do have several time measurements
already for generating metrics that we could use, though we might want to
switch to nanoTime() instead of currentTimeMillis() since some of the
values for small requests may be < 1ms. But rather than add up the time
spent in I/O thread and network thread, wouldn't it be better to convert the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > time
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > spent
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > on
> each
> > > > thread
> > > > > > > into
> > > > > > > > a
> > > > > > > > > > > > separate
> > > > > > > > > > > > > > > > ratio?
> > > > > > > > > > > > > > > > > >> UserA
> > > > > > > > > > > > > > > > > >> > >> has
> > > > > > > > > > > > > > > > > >> > >> > a
> > > > > > > > > > > > > > > > > >> > >> > >> > > request
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > quota
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > 5%.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > Can
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > we
> take
> > > that
> > > > > to
> > > > > > > mean
> > > > > > > > > > that
> > > > > > > > > > > > > UserA
> > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > use
> > > > > > > > > > > > > > > > > >> 5%
> > > > > > > > > > > > > > > > > >> > of
> > > > > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > > > > >> > >> > >> > time
> > > > > > > > > > > > > > > > > >> > >> > >> > > on
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > network
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > threads
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > and 5%
> > of
> > > > the
> > > > > > time
> > > > > > > > on
> > > > > > > > > > I/O
> > > > > > > > > > > > > > threads?
> > > > > > > > > > > > > > > > If
> > > > > > > > > > > > > > > > > >> > either
> > > > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > > > >> > >> > >> > > exceeded,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > response
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > is
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > throttled
> > > -
> > > > it
> > > > > > > would
> > > > > > > > > > mean
> > > > > > > > > > > > > > > > maintaining
> > > > > > > > > > > > > > > > > >> two
> > > > > > > > > > > > > > > > > >> > >> sets
> > > > > > > > > > > > > > > > > >> > >> > of
> > > > > > > > > > > > > > > > > >> > >> > >> > > metrics
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > for
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > two
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > durations,
> > > > but
> > > > > > > would
> > > > > > > > > > > result
> > > > > > > > > > > > in
> > > > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > >> > >> meaningful
> > > > > > > > > > > > > > > > > >> > >> > >> > ratios.
> > > > > > > > > > > > > > > > > >> > >> > >> > > We
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > could
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > define
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > two
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > quota
> > > limits
> > > > > > > (UserA
> > > > > > > > > has
> > > > > > > > > > 5%
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > >> > threads
> > > > > > > > > > > > > > > > > >> > >> > and
> > > > > > > > > > > > > > > > > >> > >> > >> 10%
> > > > > > > > > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > network
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> threads),
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > but
> that
> > > > seems
> > > > > > > > > > unnecessary
> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > harder
> > > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > >> > >> explain
> > > > > > > > > > > > > > > > > >> > >> > >> to
> > > > > > > > > > > > > > > > > >> > >> > >> > > > users.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > Back
> to
> > > why
> > > > > and
> > > > > > > how
> > > > > > > > > > quotas
> > > > > > > > > > > > are
> > > > > > > > > > > > > > > > applied
> > > > > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > > > > >> > >> > network
> > > > > > > > > > > > > > > > > >> > >> > >> > > thread
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > utilization:
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > a) In
> > the
> > > > case
> > > > > > of
> > > > > > > > > fetch,
> > > > > > > > > > > > the
> > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > >> spent in
> > > > > > > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > > > > > > >> > >> > >> > network
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > thread
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > may
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > be
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > significant
> > > > > and
> > > > > > I
> > > > > > > > can
> > > > > > > > > > see
> > > > > > > > > > > > the
> > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > >> > include
> > > > > > > > > > > > > > > > > >> > >> > >> this.
> > > > > > > > > > > > > > > > > >> > >> > >> > Are
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > there
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> requests
> > > > where
> > > > > > the
> > > > > > > > > > network
> > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > >> > >> utilization is
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > significant?
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > In
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > case
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > of
> > fetch,
> > > > > > request
> > > > > > > > > > handler
> > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > >> > utilization
> > > > > > > > > > > > > > > > > >> > >> > would
> > > > > > > > > > > > > > > > > >> > >> > >> > > > throttle
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > with
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > high
> > > request
> > > > > > rate,
> > > > > > > > low
> > > > > > > > > > > data
> > > > > > > > > > > > > > volume
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > >> > fetch
> > > > > > > > > > > > > > > > > >> > >> > byte
> > > > > > > > > > > > > > > > > >> > >> > >> > rate
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > quota
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > will
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > throttle
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> clients
> > > with
> > > > > > high
> > > > > > > > data
> > > > > > > > > > > > volume.
> > > > > > > > > > > > > > > > Network
> > > > > > > > > > > > > > > > > >> > thread
> > > > > > > > > > > > > > > > > >> > >> > >> > > utilization
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > perhaps
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > proportional
> > > > > to
> > > > > > > the
> > > > > > > > > data
> > > > > > > > > > > > > > volume. I
> > > > > > > > > > > > > > > > am
> > > > > > > > > > > > > > > > > >> > >> wondering
> > > > > > > > > > > > > > > > > >> > >> > >> if we
> > > > > > > > > > > > > > > > > >> > >> > >> > > > even
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > need
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > to
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > throttle
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > based
> on
> > > > > network
> > > > > > > > > thread
> > > > > > > > > > > > > > > utilization
> > > > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > >> > >> whether
> > > > > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > > > > >> > >> > >> > data
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > volume
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > quota
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > covers
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > this
> > case.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > b) At
> > the
> > > > > > moment,
> > > > > > > we
> > > > > > > > > > > record
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > check
> > > > > > > > > > > > > > > > > >> for
> > > > > > > > > > > > > > > > > >> > >> quota
> > > > > > > > > > > > > > > > > >> > >> > >> > > violation
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > same
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > time.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > If a
> > quota
> > > > is
> > > > > > > > > violated,
> > > > > > > > > > > the
> > > > > > > > > > > > > > > response
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > >> > >> delayed.
> > > > > > > > > > > > > > > > > >> > >> > >> > Using
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > Jay'e
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > example
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > of
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > disk
> > reads
> > > > for
> > > > > > > > fetches
> > > > > > > > > > > > > happening
> > > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> network
> > > > > > > > > > > > > > > > > >> > >> > >> > thread,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > We
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > can't
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > record
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > delay
> a
> > > > > response
> > > > > > > > after
> > > > > > > > > > the
> > > > > > > > > > > > > disk
> > > > > > > > > > > > > > > > reads.
> > > > > > > > > > > > > > > > > >> We
> > > > > > > > > > > > > > > > > >> > >> could
> > > > > > > > > > > > > > > > > >> > >> > >> > record
> > > > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > spent
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > on
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > the
> > > network
> > > > > > thread
> > > > > > > > > when
> > > > > > > > > > > the
> > > > > > > > > > > > > > > response
> > > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > >> > >> complete
> > > > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > introduce
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > a
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > delay
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > for
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > handling a
> > > > > > > > subsequent
> > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > (separate
> > > > > > > > > > > > > > > > > >> out
> > > > > > > > > > > > > > > > > >> > >> > >> recording
> > > > > > > > > > > > > > > > > >> > >> > >> > > and
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > quota
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > violation
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> handling
> > > in
> > > > > the
> > > > > > > case
> > > > > > > > > of
> > > > > > > > > > > > > network
> > > > > > > > > > > > > > > > thread
> > > > > > > > > > > > > > > > > >> > >> > overload).
> > > > > > > > > > > > > > > > > >> > >> > >> > Does
> > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > make
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > sense?
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> Regards,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > On
> Tue,
> > > Feb
> > > > > 21,
> > > > > > > 2017
> > > > > > > > > at
> > > > > > > > > > > 2:58
> > > > > > > > > > > > > AM,
> > > > > > > > > > > > > > > > > Becket
> > > > > > > > > > > > > > > > > >> > Qin <
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > becket.qin@gmail.com>
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > Hey
> > Jay,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> Yeah,
> > I
> > > > > agree
> > > > > > > that
> > > > > > > > > > > > enforcing
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > >> time
> > > > > > > > > > > > > > > > > >> > >> is a
> > > > > > > > > > > > > > > > > >> > >> > >> > little
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > tricky. I
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > am
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> thinking
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > that
> > > maybe
> > > > > we
> > > > > > > can
> > > > > > > > > use
> > > > > > > > > > > the
> > > > > > > > > > > > > > > existing
> > > > > > > > > > > > > > > > > >> > request
> > > > > > > > > > > > > > > > > >> > >> > >> > > statistics.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > They
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > are
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > already
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > very
> > > > > detailed
> > > > > > so
> > > > > > > > we
> > > > > > > > > > can
> > > > > > > > > > > > > > probably
> > > > > > > > > > > > > > > > see
> > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > >> > >> > >> > approximate
> > > > > > > > > > > > > > > > > >> > >> > >> > > > CPU
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > from
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > it,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > something
> > > > > like
> > > > > > > > > > > > (total_time -
> > > > > > > > > > > > > > > > > >> > >> > >> > > >
> > request/response_queue_time
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > -
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > remote_time).
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > I
> > agree
> > > > with
> > > > > > > > > Guozhang
> > > > > > > > > > > that
> > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > a
> > > > > > > > > > > > > > > > > >> user is
> > > > > > > > > > > > > > > > > >> > >> > >> throttled
> > > > > > > > > > > > > > > > > >> > >> > >> > > it
> > > > > > > > > > > > > > > > > >> > >> > >> > > > is
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > likely
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > that
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > need
> > to
> > > > see
> > > > > if
> > > > > > > > > > anything
> > > > > > > > > > > > has
> > > > > > > > > > > > > > went
> > > > > > > > > > > > > > > > > wrong
> > > > > > > > > > > > > > > > > >> > >> first,
> > > > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > > > >> > >> > >> > if
> > > > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > users
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > are
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > well
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > behaving
> > > > and
> > > > > > > just
> > > > > > > > > need
> > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > >> resources, we
> > > > > > > > > > > > > > > > > >> > >> will
> > > > > > > > > > > > > > > > > >> > >> > >> have
> > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > bump
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > up
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > quota
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > for
> > > them.
> > > > It
> > > > > > is
> > > > > > > > true
> > > > > > > > > > > that
> > > > > > > > > > > > > > > > > >> pre-allocating
> > > > > > > > > > > > > > > > > >> > >> CPU
> > > > > > > > > > > > > > > > > >> > >> > >> time
> > > > > > > > > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > precisely
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > for
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> users
> > is
> > > > > > > > difficult.
> > > > > > > > > So
> > > > > > > > > > > in
> > > > > > > > > > > > > > > practice
> > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > >> > would
> > > > > > > > > > > > > > > > > >> > >> > >> > probably
> > > > > > > > > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > more
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > like
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > first
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > set
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > a
> > > relative
> > > > > > high
> > > > > > > > > > > protective
> > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > >> quota
> > > > > > > > > > > > > > > > > >> > >> for
> > > > > > > > > > > > > > > > > >> > >> > >> > > everyone
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > and
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > increase
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > for
> > some
> > > > > > > > individual
> > > > > > > > > > > > clients
> > > > > > > > > > > > > on
> > > > > > > > > > > > > > > > > demand.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > Thanks,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > Jiangjie
> > > > > > > (Becket)
> > > > > > > > > Qin
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > On
> > Mon,
> > > > Feb
> > > > > > 20,
> > > > > > > > 2017
> > > > > > > > > > at
> > > > > > > > > > > > 5:48
> > > > > > > > > > > > > > PM,
> > > > > > > > > > > > > > > > > >> Guozhang
> > > > > > > > > > > > > > > > > >> > >> > Wang <
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> > > wangguoz@gmail.com
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> This
> > > is
> > > > a
> > > > > > > great
> > > > > > > > > > > > proposal,
> > > > > > > > > > > > > > glad
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > >> see
> > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > >> > >> > >> > happening.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > > I
> am
> > > > > > inclined
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > CPU
> > > > > > > > > > > > > > > > > >> throttling, or
> > > > > > > > > > > > > > > > > >> > >> more
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > specifically
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > processing
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > ratio
> > > > > > instead
> > > > > > > of
> > > > > > > > > the
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > > > >> > >> throttling
> > > > > > > > > > > > > > > > > >> > >> > >> as
> > > > > > > > > > > > > > > > > >> > >> > >> > > well.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > Becket
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > has
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > very
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > well
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > summed
> > > > my
> > > > > > > > > rationales
> > > > > > > > > > > > > above,
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > > > >> > >> thing to
> > > > > > > > > > > > > > > > > >> > >> > >> add
> > > > > > > > > > > > > > > > > >> > >> > >> > > here
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > that
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > former
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> has
> > a
> > > > good
> > > > > > > > support
> > > > > > > > > > for
> > > > > > > > > > > > > both
> > > > > > > > > > > > > > > > > >> "protecting
> > > > > > > > > > > > > > > > > >> > >> > >> against
> > > > > > > > > > > > > > > > > >> > >> > >> > > rogue
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients"
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > as
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > well
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > "utilizing a
> > > > > > > > > cluster
> > > > > > > > > > > for
> > > > > > > > > > > > > > > > > >> multi-tenancy
> > > > > > > > > > > > > > > > > >> > >> > usage":
> > > > > > > > > > > > > > > > > >> > >> > >> > when
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > thinking
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > about
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > how
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > explain
> > > > > this
> > > > > > > to
> > > > > > > > > the
> > > > > > > > > > > end
> > > > > > > > > > > > > > > users, I
> > > > > > > > > > > > > > > > > >> find
> > > > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > > > >> > >> > >> actually
> > > > > > > > > > > > > > > > > >> > >> > >> > > > more
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > natural
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > than
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > request
> > > > > rate
> > > > > > > > since
> > > > > > > > > > as
> > > > > > > > > > > > > > > mentioned
> > > > > > > > > > > > > > > > > >> above,
> > > > > > > > > > > > > > > > > >> > >> > >> different
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > will
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > have
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > quite
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > different
> > > > > > > > "cost",
> > > > > > > > > > and
> > > > > > > > > > > > > Kafka
> > > > > > > > > > > > > > > > today
> > > > > > > > > > > > > > > > > >> > already
> > > > > > > > > > > > > > > > > >> > >> > have
> > > > > > > > > > > > > > > > > >> > >> > >> > > > various
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > request
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > types
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > (produce,
> > > > > > > fetch,
> > > > > > > > > > > admin,
> > > > > > > > > > > > > > > > metadata,
> > > > > > > > > > > > > > > > > >> etc),
> > > > > > > > > > > > > > > > > >> > >> > >> because
> > > > > > > > > > > > > > > > > >> > >> > >> > of
> > > > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > request
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > throttling
> > > > > > may
> > > > > > > > not
> > > > > > > > > > be
> > > > > > > > > > > as
> > > > > > > > > > > > > > > > effective
> > > > > > > > > > > > > > > > > >> > >> unless it
> > > > > > > > > > > > > > > > > >> > >> > >> is
> > > > > > > > > > > > > > > > > >> > >> > >> > set
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > very
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > conservatively.
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > Regarding
> > > > > to
> > > > > > > > user
> > > > > > > > > > > > > reactions
> > > > > > > > > > > > > > > when
> > > > > > > > > > > > > > > > > >> they
> > > > > > > > > > > > > > > > > >> > are
> > > > > > > > > > > > > > > > > >> > >> > >> > > throttled,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > I
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > think
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > it
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > may
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> differ
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > case-by-case,
> > > > > > > > and
> > > > > > > > > > need
> > > > > > > > > > > > to
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > > >> > discovered /
> > > > > > > > > > > > > > > > > >> > >> > >> guided
> > > > > > > > > > > > > > > > > >> > >> > >> > by
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > looking
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > at
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > relative
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > metrics.
> > > > > So
> > > > > > in
> > > > > > > > > other
> > > > > > > > > > > > words
> > > > > > > > > > > > > > > users
> > > > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > > > >> > >> not
> > > > > > > > > > > > > > > > > >> > >> > >> expect
> > > > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > get
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > additional
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > information
> > > > > > by
> > > > > > > > > > simply
> > > > > > > > > > > > > being
> > > > > > > > > > > > > > > told
> > > > > > > > > > > > > > > > > >> "hey,
> > > > > > > > > > > > > > > > > >> > >> you
> > > > > > > > > > > > > > > > > >> > >> > are
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > throttled",
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > which
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > all
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> what
> > > > > > > throttling
> > > > > > > > > > does;
> > > > > > > > > > > > they
> > > > > > > > > > > > > > > need
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > >> > take a
> > > > > > > > > > > > > > > > > >> > >> > >> > follow-up
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > step
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > and
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > see
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > "hmm,
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > throttled
> > > > > > > > probably
> > > > > > > > > > > > because
> > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > ..",
> > > > > > > > > > > > > > > > > >> > which
> > > > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > > > >> > >> > by
> > > > > > > > > > > > > > > > > >> > >> > >> > > > looking
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > metric
> > > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > values:
> > > > > e.g.
> > > > > > > > > whether
> > > > > > > > > > > I'm
> > > > > > > > > > > > > > > > > bombarding
> > > > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > > > >> > >> > >> brokers
> > > > > > > > > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > [Message clipped]
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > --
> > > > > > > > *Todd Palino*
> > > > > > > > Staff Site Reliability Engineer
> > > > > > > > Data Infrastructure Streaming
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > linkedin.com/in/toddpalino
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Todd Palino*
> > > > > > Staff Site Reliability Engineer
> > > > > > Data Infrastructure Streaming
> > > > > >
> > > > > >
> > > > > >
> > > > > > linkedin.com/in/toddpalino
> > > > > >
> > > > >
> > > >
> > >
> >
>
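
To make the two-ratio idea above concrete, here is a minimal sketch of how
per-pool utilization could be checked (illustrative only; the class and
method names are hypothetical, not the KIP's actual implementation):

    // Sketch: track a user's time on each thread pool as a separate ratio
    // over a sampling window, and throttle if either pool's share of the
    // quota is exceeded, as suggested above.
    class UserThreadUsage {
        long ioThreadNanos;      // time spent on I/O (request handler) threads
        long networkThreadNanos; // time spent on network threads
    }

    boolean exceedsQuota(UserThreadUsage usage, int numIoThreads,
                         int numNetworkThreads, long windowNanos,
                         double quotaRatio) {
        // Each pool's capacity over the window is threads * window length.
        double ioRatio = (double) usage.ioThreadNanos
                / (numIoThreads * windowNanos);
        double networkRatio = (double) usage.networkThreadNanos
                / (numNetworkThreads * windowNanos);
        return ioRatio > quotaRatio || networkRatio > quotaRatio;
    }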

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Jun,

40. Yes you are right, a single value tracking the total exempt time is
sufficient. Have updated the KIP.

Thank you,

Rajini
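
As a rough sketch, a single exempt-time sensor could be modelled with the
org.apache.kafka.common.metrics classes roughly as follows (illustrative
only; the broker's actual metric names and groups are defined by the KIP):

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Rate;

    class ExemptRequestTimeMetric {
        private final Sensor exemptSensor;

        ExemptRequestTimeMetric(Metrics metrics) {
            // One sensor for all exempt requests; no per-user or
            // per-client tags, since the value is independent of entity
            // types.
            exemptSensor = metrics.sensor("exempt-request-time");
            exemptSensor.add(
                    metrics.metricName("exempt-request-time", "Request"),
                    new Rate());
        }

        void record(double requestTimeNanos) {
            exemptSensor.record(requestTimeNanos);
        }
    }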

On Thu, Mar 9, 2017 at 9:42 PM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> The updated KIP looks good. Just one more comment.
>
> 40. "An additional metric exempt-request-time will also be added for each
> quota entity for the quota type Request." Should that metric be added for
> each entity type (e.g., user, client-id, etc)? It seems that value is
> independent of entity types.
>
> Thanks,
>
> Jun
>
> On Thu, Mar 9, 2017 at 12:07 PM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Hi Jun,
> >
> > Thank you for reviewing the KIP again.
> >
> > 30. That is a good idea. In fact, it is one of the advantages of
> > measuring overall utilization rather than separate values for network
> > and I/O threads as I had intended initially. Have updated the KIP,
> > thanks.
> >
> > 31. Added exempt-request-time metric.
> >
> > 32. I had thought of using quota.window.size.seconds * quota.window.num
> > initially, but felt that would be too big. Even the default of 11 seconds
> > is a rather long time to be throttled. With a limit of
> > quota.window.size.seconds, subsequent requests for that total interval of
> > the samples will also each be throttled for quota.window.size.seconds if
> > the time recorded was very high. So limiting at quota.window.size.seconds
> > limits the throttle time for an individual request, avoiding timeouts
> > where possible, but still throttles over a period of time.
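
A sketch of the capping arithmetic described above (names are illustrative,
not the broker's actual code):

    // The computed delay may be large if the recorded time was very high,
    // but each individual response is throttled for at most one quota
    // window (quota.window.size.seconds, one second by default).
    long throttleTimeMs(long computedDelayMs, long quotaWindowSizeMs) {
        return Math.min(computedDelayMs, quotaWindowSizeMs);
    }

So a request that "owes" 5000 ms of delay is held for only 1000 ms, but
subsequent requests continue to be delayed until the measured usage falls
back under the quota.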
> >
> > 33. Updated to use request_percentage.
> >
> >
> > On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Thanks for the updated KIP. A few more comments.
> > >
> > > 30. Should we just account for the time in network threads in this
> > > KIP too? The issue with doing this later is that existing quotas may
> > > be too small and everyone will have to adjust them before upgrading,
> > > which is inconvenient. If we just do the delaying in the io threads,
> > > there probably isn't too much additional work to include the network
> > > thread time?
> > >
> > > 31. It would be useful for the new metrics to capture the utilization
> > > of all those requests exempt from request throttling (under sth like
> > > "exempt"). It's useful for an admin to know how much time is spent
> > > there too.
> > >
> > > 32. "The maximum throttle time for any single request will be the
> > > quota window size (one second by default)." We probably should cap the
> > > delay at quota.window.size.seconds * quota.window.num?
> > >
> > > 33. It's unfortunate that we use . in configs and _ in ZK data
> > > structures. However, for consistency, request.percentage in ZK
> > > probably should be request_percentage?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > >
> > > > I have updated the KIP to use "request.percentage" quotas where the
> > > > percentage is out of a total of (num.io.threads * 100). I have added
> > > > the other options considered so far under "Rejected Alternatives".
> > > >
> > > > To address Todd's concern about per-thread quotas: Even though the
> > > > quotas are out of (num.io.threads * 100), clients are not locked
> > > > into threads. Utilization is measured as the total across all the
> > > > I/O threads, and a 10% quota can be 1% of 10 threads. Individual
> > > > quotas can also be greater than 100% if required.
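
(As a worked example of the arithmetic above: with num.io.threads = 10 the
total capacity is 1000%, so a 10% quota allows a user 10 percentage points
of thread time in aggregate, e.g. 1% on each of the 10 threads or 10% of a
single thread.)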
> > > >
> > > > Please let me know if there are any other concerns or suggestions.
> > > >
> > > > Thank you,
> > > >
> > > > Rajini
> > > >
> > > > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tp...@gmail.com> wrote:
> > > >
> > > > > Rajini -
> > > > >
> > > > > I understand what you’re saying, but the point I’m making is that
> > > > > I don’t believe we need to take it into account directly. The CPU
> > > > > utilization of the network threads is directly proportional to the
> > > > > number of bytes being sent. The more bytes, the more CPU that is
> > > > > required for SSL (or other tasks). This is opposed to the request
> > > > > handler threads, where there are a number of factors that affect
> > > > > CPU utilization. This means that it’s not necessary to separately
> > > > > quota network thread byte usage and CPU - if we quota byte usage
> > > > > (which we already do), we have fixed the CPU usage at a
> > > > > proportional amount.
> > > > >
> > > > > Jun -
> > > > >
> > > > > Thanks for the clarification there. I was thinking of the
> > > > > utilization percentage as being fixed, not what the percentage
> > > > > reflects. I’m not tied to either way of doing it, provided that we
> > > > > do not lock clients to a single thread. For example, if I specify
> > > > > that a given client can use 10% of a single thread, that should
> > > > > also mean they can use 1% on 10 threads.
> > > > >
> > > > > -Todd
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Todd,
> > > > > >
> > > > > > Thanks for the feedback.
> > > > > >
> > > > > > I just want to clarify your second point. If the limit
> > > > > > percentage is per thread and the thread counts are changed, the
> > > > > > absolute processing limit for existing users hasn't changed and
> > > > > > there is no need to adjust them. On the other hand, if the limit
> > > > > > percentage is of total thread pool capacity and the thread
> > > > > > counts are changed, the effective processing limit for a user
> > > > > > will change. So, to preserve the current processing limit,
> > > > > > existing user limits have to be adjusted. If there is a hardware
> > > > > > change, the effective processing limit for a user will change in
> > > > > > either approach and the existing limit may need to be adjusted.
> > > > > > However, hardware changes are less common than thread pool
> > > > > > configuration changes.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tp...@gmail.com> wrote:
> > > > > >
> > > > > > > I’ve been following this one on and off, and overall it
> > > > > > > sounds good to me.
> > > > > > >
> > > > > > > - The SSL question is a good one. However, that type of
> > > > > > > overhead should be proportional to the bytes rate, so I think
> > > > > > > that a bytes rate quota would still be a suitable way to
> > > > > > > address it.
> > > > > > >
> > > > > > > - I think it’s better to make the quota percentage of total
> > > > > > > thread pool capacity, and not percentage of an individual
> > > > > > > thread. That way you don’t have to adjust it when you adjust
> > > > > > > thread counts (tuning, hardware changes, etc.)
> > > > > > >
> > > > > > >
> > > > > > > -Todd
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <becket.qin@gmail.com> wrote:
> > > > > > >
> > > > > > > > I see. Good point about SSL.
> > > > > > > >
> > > > > > > > I just asked Todd to take a look.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > > > > >
> > > > > > > > > Hi, Jiangjie,
> > > > > > > > >
> > > > > > > > > Yes, I agree that byte rate already protects the network
> > > > > > > > > threads indirectly. I am not sure if byte rate fully
> > > > > > > > > captures the CPU overhead in network due to SSL. So, at
> > > > > > > > > the high level, we can use request time limit to protect
> > > > > > > > > CPU and use byte rate to protect storage and network.
> > > > > > > > >
> > > > > > > > > Also, do you think you can get Todd to comment on this KIP?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <becket.qin@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Hi Rajini/Jun,
> > > > > > > > > >
> > > > > > > > > > The percentage based reasoning sounds good.
> > > > > > > > > > One thing I am wondering is that if we assume the
> > > > > > > > > > network threads are just doing the network IO, can we
> > > > > > > > > > say the bytes rate quota is already sort of a network
> > > > > > > > > > threads quota?
> > > > > > > > > > If we take network threads into the consideration here,
> > > > > > > > > > would that be somewhat overlapping with the bytes rate
> > > > > > > > > > quota?
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Jun,
> > > > > > > > > > >
> > > > > > > > > > > Thank you for the explanation, I hadn't realized you
> > > > > > > > > > > meant percentage of the total thread pool. If everyone
> > > > > > > > > > > is OK with Jun's suggestion, I will update the KIP.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Rajini
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > >
> > > > > > > > > > > > Let's take your example. Let's say a user sets the
> > > > > > > > > > > > limit to 50%. I am not sure if it's better to apply
> > > > > > > > > > > > the same percentage separately to the network and io
> > > > > > > > > > > > thread pools. For example, for produce requests,
> > > > > > > > > > > > most of the time will be spent in the io threads
> > > > > > > > > > > > whereas for fetch requests, most of the time will be
> > > > > > > > > > > > in the network threads. So, using the same
> > > > > > > > > > > > percentage in both thread pools means one of the
> > > > > > > > > > > > pools' resources will be over allocated.
> > > > > > > > > > > >
> > > > > > > > > > > > An alternative way is to simply model the network
> > > > > > > > > > > > and io thread pools together. If you get 10 io
> > > > > > > > > > > > threads and 5 network threads, you get 1500% request
> > > > > > > > > > > > processing power. A 50% limit means a total of 750%
> > > > > > > > > > > > processing power. We just add up the time a user
> > > > > > > > > > > > request spent in either the network or io thread. If
> > > > > > > > > > > > that total exceeds 750% (it doesn't matter whether
> > > > > > > > > > > > it's spent more in the network or io thread), the
> > > > > > > > > > > > request will be throttled. This seems more general
> > > > > > > > > > > > and is not sensitive to the current implementation
> > > > > > > > > > > > detail of having separate network and io thread
> > > > > > > > > > > > pools. In the future, if the threading model
> > > > > > > > > > > > changes, the same concept of quota can still be
> > > > > > > > > > > > applied. For now, since it's a bit tricky to add the
> > > > > > > > > > > > delay logic in the network thread pool, we could
> > > > > > > > > > > > probably just do the delaying only in the io threads
> > > > > > > > > > > > as you suggested earlier.
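
A sketch of the combined accounting described here (illustrative names
only, not actual broker code):

    // One quota over the combined network + I/O pools: with 10 io threads
    // and 5 network threads the capacity is 1500%, so a 50% limit allows
    // a user 750% of single-thread time per window, wherever it is spent.
    boolean exceedsCombinedQuota(double ioTimePercent,
                                 double networkTimePercent,
                                 int numIoThreads, int numNetworkThreads,
                                 double limitFraction) {
        double capacityPercent = (numIoThreads + numNetworkThreads) * 100.0;
        double usedPercent = ioTimePercent + networkTimePercent;
        return usedPercent > limitFraction * capacityPercent;
    }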
> > > > > > > > > > > >
> > > > > > > > > > > > There is still the orthogonal question of whether a
> > > > > > > > > > > > quota of 50% is out of 100% or 100% * #total
> > > > > > > > > > > > processing threads. My feeling is that the latter is
> > > > > > > > > > > > slightly better based on my explanation earlier. The
> > > > > > > > > > > > way to describe this quota to the users can be
> > > > > > > > > > > > "share of elapsed request processing time on a
> > > > > > > > > > > > single CPU" (similar to top).
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Jun,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > > > > >
> > > > > > > > > > > > > But still not sure about a single quota covering
> > > > > > > > > > > > > both network threads and I/O threads with
> > > > > > > > > > > > > per-thread quota. If there are 10 I/O threads and
> > > > > > > > > > > > > 5 network threads and I want to assign half the
> > > > > > > > > > > > > quota to userA, the quota would be 750%. I
> > > > > > > > > > > > > imagine, internally, we would convert this to 500%
> > > > > > > > > > > > > for I/O and 250% for network threads to allocate
> > > > > > > > > > > > > 50% of each pool.
> > > > > > > > > > > > >
> > > > > > > > > > > > > A couple of scenarios:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. Admin adds 1 extra network thread. To retain
> > > > > > > > > > > > > 50%, admin needs to now allocate 800% for each
> > > > > > > > > > > > > user. Or increase the quota for a few users. To
> > > > > > > > > > > > > me, it feels like admin needs to convert 50% to
> > > > > > > > > > > > > 800% and Kafka internally needs to convert 800% to
> > > > > > > > > > > > > (500%, 300%). Everyone using just 50% feels a lot
> > > > > > > > > > > > > simpler.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. We decide to add some other thread to this
> > > > > > > > > > > > > list. Admin needs to know exactly how many threads
> > > > > > > > > > > > > form the maximum quota. And we can be changing
> > > > > > > > > > > > > this between broker versions as we add more to the
> > > > > > > > > > > > > list. Again a single overall percent would be a
> > > > > > > > > > > > > lot simpler.
> > > > > > > > > > > > >
> > > > > > > > > > > > > There were others who were unconvinced by a single
> > > > > > > > > > > > > percent from the initial proposal and were happier
> > > > > > > > > > > > > with thread units similar to CPU units, so I am ok
> > > > > > > > > > > > > with going with per-thread quotas (as units or
> > > > > > > > > > > > > percent). Just not sure it makes it easier for
> > > > > > > > > > > > > admin in all cases.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Rajini
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Consider modeling as n * 100% unit. For 2), the
> > > > > > > > > > > > > > question is what's causing the I/O threads to be
> > > > > > > > > > > > > > saturated. It's unlikely that all users'
> > > > > > > > > > > > > > utilization has increased at the same time. A
> > > > > > > > > > > > > > more likely case is that a few isolated users'
> > > > > > > > > > > > > > utilization has increased. If so, after
> > > > > > > > > > > > > > increasing the number of threads, the admin just
> > > > > > > > > > > > > > needs to adjust the quota for a few isolated
> > > > > > > > > > > > > > users, which is expected and is less work.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all
> > > > > > > > > > > > > > users' quotas need to be adjusted, which is
> > > > > > > > > > > > > > unexpected and is more work.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So, to me, the n * 100% model seems more
> > > > > > > > > > > > > > convenient.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > As for future extension to cover network thread
> > > > > > > > > > > > > > utilization, I was thinking that one way is to
> > > > > > > > > > > > > > simply model the capacity as (n + m) * 100%
> > > > > > > > > > > > > > unit, where n and m are the number of network
> > > > > > > > > > > > > > and i/o threads, respectively. Then, for each
> > > > > > > > > > > > > > user, we can just add up the utilization in the
> > > > > > > > > > > > > > network and the i/o thread. If we do this, we
> > > > > > > > > > > > > > don't need a new type of quota.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If we use request.percentage as the
> > > > > > > > > > > > > > > percentage used in a single I/O thread, the
> > > > > > > > > > > > > > > total percentage being allocated will be
> > > > > > > > > > > > > > > num.io.threads * 100 for I/O threads and
> > > > > > > > > > > > > > > num.network.threads * 100 for network threads.
> > > > > > > > > > > > > > > A single quota covering the two as a
> > > > > > > > > > > > > > > percentage wouldn't quite work if you want to
> > > > > > > > > > > > > > > allocate the same proportion in both cases. If
> > > > > > > > > > > > > > > we want to treat threads as separate units,
> > > > > > > > > > > > > > > won't we need two quota configurations
> > > > > > > > > > > > > > > regardless of whether we use units or
> > > > > > > > > > > > > > > percentage? Perhaps I misunderstood your
> > > > > > > > > > > > > > > suggestion.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >    1. The use case that you mentioned where an
> > > admin
> > > > is
> > > > > > > > adding
> > > > > > > > > > more
> > > > > > > > > > > > > users
> > > > > > > > > > > > > > >    and decides to add more I/O threads and
> > expects
> > > to
> > > > > > find
> > > > > > > > free
> > > > > > > > > > > quota
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > >    allocate for new users.
> > > > > > > > > > > > > > >    2. Admin adds more I/O threads because the
> I/O
> > > > > threads
> > > > > > > are
> > > > > > > > > > > > saturated
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > > >    there are cores available to allocate, even
> > > though
> > > > > the
> > > > > > > > > number
> > > > > > > > > > or
> > > > > > > > > > > > > > >    users/clients hasn't changed.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If we allocated treated I/O threads as a single
> > > unit
> > > > of
> > > > > > > 100%,
> > > > > > > > > all
> > > > > > > > > > > > user
> > > > > > > > > > > > > > > quotas need to be reallocated for 1). If we
> > > allocated
> > > > > I/O
> > > > > > > > > threads
> > > > > > > > > > > as
> > > > > > > > > > > > n
> > > > > > > > > > > > > > > units with n*100%, all user quotas need to be
> > > > > reallocated
> > > > > > > for
> > > > > > > > > 2),
> > > > > > > > > > > > > > otherwise
> > > > > > > > > > > > > > > some of the new threads may just not be used.
> > > Either
> > > > > way
> > > > > > it
> > > > > > > > > > should
> > > > > > > > > > > be
> > > > > > > > > > > > > > easy
> > > > > > > > > > > > > > > to write a script to decrease/increase quotas
> by
> > a
> > > > > > multiple
> > > > > > > > for
> > > > > > > > > > all
> > > > > > > > > > > > > > users.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > So it really boils down to which quota unit is
> > most
> > > > > > > intuitive
> > > > > > > > > in
> > > > > > > > > > > > terms
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > configuration. And from the discussion so far,
> it
> > > > feels
> > > > > > > like
> > > > > > > > > > > opinion
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > divided on whether quotas should be carved out
> of
> > > an
> > > > > > > absolute
> > > > > > > > > > 100%
> > > > > > > > > > > > (or
> > > > > > > > > > > > > 1
> > > > > > > > > > > > > > > unit) or be relative to the number of threads
> > > (n*100%
> > > > > or
> > > > > > n
> > > > > > > > > > units).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
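
A minimal sketch of such a rescaling script, assuming quotas are held as a
simple map from user to quota value (the helper is hypothetical; a real
script would read and write the quota configs via the usual tooling):

    import java.util.HashMap;
    import java.util.Map;

    // Illustrative sketch only: rescale every user's quota by a factor when
    // the thread count changes, preserving each user's share of capacity.
    public final class QuotaRescaler {
        public static Map<String, Double> scaleAll(Map<String, Double> quotas,
                                                   double factor) {
            Map<String, Double> scaled = new HashMap<>();
            quotas.forEach((user, quota) -> scaled.put(user, quota * factor));
            return scaled;
        }
    }

Going from 8 to 12 I/O threads, case 1) under the 1 * 100% model would call
scaleAll(quotas, 8.0 / 12) to free headroom for new users, while case 2)
under the n * 100% model would call scaleAll(quotas, 12.0 / 8) so that
existing users can actually occupy the new threads.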

On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io> wrote:

Another way to express an absolute limit is to use request.percentage, but
treat it as the percentage used in a single request handling thread. For
now, the request handling threads can be just the I/O threads. In the
future, they can cover the network threads as well. This is similar to how
top reports CPU usage and may be a bit easier for people to understand.
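
For illustration, a quota expressed this way converts to an absolute
thread-time budget per metrics window, exactly like a process showing p%
CPU in top (illustrative arithmetic, not broker code):

    // Illustrative sketch only: "p percent of one request handler thread"
    // as an absolute budget of thread-milliseconds per window. A value
    // above 100 simply means "more than one full thread".
    public final class PerThreadPercentQuota {
        // e.g. quotaPercent = 50, windowMs = 1000 -> 500 thread-ms per second.
        public static double allowedThreadMsPerWindow(double quotaPercent,
                                                      long windowMs) {
            return quotaPercent / 100.0 * windowMs;
        }
    }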

Thanks,

Jun

On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Jay,

2. Regarding request.unit vs request.percentage: I started with
request.percentage too. The reasoning for request.unit is the following.
Suppose that the capacity has been reached on a broker and the admin needs
to add a new user. A simple way to increase the capacity is to increase the
number of I/O threads, assuming there are still enough cores. If the limit
is based on percentage, the additional capacity automatically gets
distributed to existing users and we haven't really carved out any
additional resource for the new user. Now, is it easy for a user to reason
about 0.1 unit vs 10%? My feeling is that both are hard and have to be
configured empirically. Not sure if percentage is obviously easier to reason
about.
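
A hypothetical worked example of the difference between the two spellings
(the numbers are made up for illustration):

    // Illustrative sketch only: the same allowance expressed both ways on a
    // broker with 8 I/O threads.
    public final class UnitVsPercent {
        public static void main(String[] args) {
            int ioThreads = 8;
            double units = 0.1;      // "0.1 of one thread"
            double percent = 10.0;   // "10% of total capacity"

            double unitBudgetMsPerSec = units * 1000;                          // 100 thread-ms/s
            double percentBudgetMsPerSec = percent / 100.0 * ioThreads * 1000; // 800 thread-ms/s

            // Doubling ioThreads to 16 leaves the unit-based budget at 100 ms
            // but silently doubles the percentage-based budget to 1600 ms.
            System.out.printf("unit budget=%.0f ms/s, percent budget=%.0f ms/s%n",
                              unitBudgetMsPerSec, percentBudgetMsPerSec);
        }
    }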

Thanks,

Jun

On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:

A couple of quick points:

1. Even though the implementation of this quota is only using I/O thread
time, I think we should call it something like "request-time". This will
give us flexibility to improve the implementation to cover network threads
in the future and will avoid exposing internal details like our thread
pools on the server.

2. Jun/Roger, I get what you are trying to fix, but the idea of thread/units
is super unintuitive as a user-facing knob. I had to read the KIP like eight
times to understand this. I'm not sure about your point that increasing the
number of threads is a problem with a percentage-based value; it really
depends on whether the user thinks about the "percentage of request
processing time" or "thread units". If they think "I have allocated 10% of
my request processing time to user x", then it is a bug that increasing the
thread count decreases that percent, as it does in the current proposal. As
a practical matter I think the only way to actually reason about this is as
a percent---I just don't believe people are going to think, "ah, 4.3 thread
units, that is the right amount!". Instead I think they have to understand
this thread unit concept, figure out what they have set in number of
threads, compute a percent and then come up with the number of thread units,
and these will all be wrong if that thread count changes. I also think this
ties us to throttling the I/O thread pool, which may not be where we want to
end up.

3. For what it's worth, I do think having a single throttle_ms field in all
the responses that combines all throttling from all quotas is probably the
simplest. There could be a use case for having separate fields for each, but
I think that is actually harder to use/monitor in the common case, so unless
someone has a use case I think just one should be fine (a sketch follows
below).
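
A minimal sketch of what that single combined field could look like (the
names are hypothetical, and how the broker folds the delays together is
assumed here, not specified):

    // Illustrative sketch only: one throttle field per response instead of
    // one per quota type.
    public final class ResponseThrottle {
        public final long throttleTimeMs;   // total delay from all quotas

        public ResponseThrottle(long byteRateDelayMs, long requestTimeDelayMs) {
            // Assuming one delay is applied "on top" of the other rather
            // than back to back, the client observes the larger of the two.
            this.throttleTimeMs = Math.max(byteRateDelayMs, requestTimeDelayMs);
        }
    }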

-Jay

On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

I have updated the KIP based on the discussions so far.

Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like
LeaderAndIsr etc. The simplest way to ensure that clients cannot use these
requests to bypass quotas for DoS attacks is to ensure that ACLs prevent
clients from using these requests and that unauthorized requests are
included towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a separate
throttle time, and all utilization-based quotas could use the same field (we
won't add another one for network thread utilization, for instance). But
perhaps it makes sense to keep byte rate quotas separate in produce/fetch
responses to provide separate metrics? Agree with Ismael that the name of
the existing field should be changed if we have two. Happy to switch to a
single combined throttle time if that is sufficient.

Ismael #4, #5, #6: Will update the KIP. Will use a dot-separated name for
the new property. Replication quotas use dot-separated names, so it will be
consistent with all properties except byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen
because the time per request can vary significantly between requests, as
mentioned in the discussion and the KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more
configuration and more metrics. Since most users would set quotas higher
than the expected usage and quotas are more of a safety net, a single quota
should work in most cases.
#3 The number of requests in purgatory is limited by the number of active
connections, since only one request per connection will be throttled at a
time.
#4 As with byte rate quotas, to use the full allocated quotas, clients/users
would need to use partitions that are distributed across the cluster. The
alternative of using cluster-wide quotas instead of per-broker quotas would
be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for the quota types Fetch
and Produce. A new one will be added for IOThread, which manages quotas for
I/O thread utilization. This will not update the Fetch or Produce
queue-size, but will have a separate metric for the queue-size. I wasn't
planning to add any additional metrics apart from the equivalent ones for
existing quotas as part of this KIP. The ratio of byte-rate to I/O thread
utilization could be slightly misleading, since it depends on the sequence
of requests. But we can look into more metrics after the KIP is implemented,
if required.

I think we need to limit the maximum delay, since all requests are
throttled. If a client has a quota of 0.001 units and a single request used
50ms, we don't want to delay all requests from the client by 50 seconds,
throwing the client out of all its consumer groups. The issue arises only if
a user is allocated a quota that is insufficient to process one large
request. The expectation is that the units allocated per user will be much
higher than the time taken to process one request, and the limit should
seldom be applied. Agree this needs proper documentation; a sketch of the
capping arithmetic is below.
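
A minimal sketch of that capping arithmetic, under the assumption that the
raw delay is whatever brings the client's usage back under quota (the names
and exact formula here are illustrative):

    // Illustrative sketch only: cap the computed delay at the quota window
    // so a tiny quota plus one expensive request cannot stall a client for
    // long.
    public final class ThrottleCap {
        // e.g. usedThreadMs = 50, quotaUnits = 0.001 (fraction of one
        // thread) gives a raw delay of ~50,000 ms, capped here to one window.
        public static long delayMs(double usedThreadMs,
                                   double quotaUnits,
                                   long windowMs) {
            long rawDelayMs = (long) (usedThreadMs / quotaUnits - usedThreadMs);
            return Math.min(rawDelayMs, windowMs);
        }
    }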

Regards,

Rajini

On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:

@jun: I wasn't concerned about tying up a request processing thread, but
IIUC the code does still read the entire request out, which might add up to
a non-negligible amount of memory.

On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to the window
size if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota
over a long period of time. Can you provide an upper bound on this
deviation? (A rough worked example is sketched below.)

2) What is the motivation for capping the maximum delay at the window size?
I am wondering if there is a better alternative to address the problem.

3) It means that the existing metric-related config will have a more direct
impact on the mechanism of this io-thread-unit-based quota. This may be an
important change depending on the answer to 1) above. We probably need to
document this more explicitly.
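
For a feel of the deviation in 1), a rough worked example (all numbers are
made up; the bound depends on the actual window size and request cost):

    // Illustrative sketch only: with the delay capped at the window size W,
    // a client issuing one request of cost T thread-ms per delay can sustain
    // roughly T / W of a thread, no matter how small its quota q is.
    public final class CapDeviation {
        public static void main(String[] args) {
            double q = 0.001;        // quota: fraction of one I/O thread
            double windowMs = 30_000;
            double requestMs = 50;   // thread time of one request

            double uncappedDelayMs = requestMs / q - requestMs;          // ~49,950 ms
            double sustainedShare = requestMs / (requestMs + windowMs);  // ~0.00166

            System.out.printf("uncapped delay=%.0f ms, sustained share=%.5f vs quota=%.5f%n",
                              uncappedDelayMs, sustainedShare, q);
        }
    }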

Dong

On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah, you are right. I thought it wasn't, because at LinkedIn it would be
too much pressure on inGraph to expose those per-clientId metrics, so we
ended up printing them periodically to a local log. Never mind if it is not
a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for every
quota in ProduceResponse or FetchResponse. Is there any use-case for having
separate throttle-time fields for the byte-rate quota and the
io-thread-unit quota? You probably need to document this as an interface
change if you plan to add a new field in any request.

- I don't think IOThread belongs in quotaType. The existing quota types
(i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the type
of requests that are throttled, not the quota mechanism that is applied.

- If a request is throttled due to this io-thread-unit-based quota, is the
existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the
io-thread-unit-based quota, and for users to understand its impact on their
traffic, would it be useful to have a metric that shows the overall
byte-rate per io-thread-unit? Can we also show this as a per-clientId
metric?

Thanks,
Dong

On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically an admin won't configure more I/O threads than CPU cores,
but it's possible for an admin to start with fewer I/O threads than cores
and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a
user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of
returning an error immediately is that the latter has no way to prevent the
client from retrying immediately, which will make things worse. The delaying
logic is based off a delay queue. A separate expiration thread just waits on
the next request to expire. So, it doesn't tie up a request handler thread
(see the sketch below).
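
For illustration, a minimal sketch of that pattern using
java.util.concurrent.DelayQueue (this is not the actual broker code; the
completion callback is hypothetical):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch only: throttled responses parked on a DelayQueue
    // and completed by a single expiration thread, so no handler thread
    // blocks while a response is being delayed.
    final class ThrottledResponse implements Delayed {
        final long sendAtNanos;
        final Runnable sendResponse;   // hypothetical completion callback

        ThrottledResponse(long delayMs, Runnable sendResponse) {
            this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
            this.sendResponse = sendResponse;
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    final class ThrottleReaper extends Thread {
        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void throttle(long delayMs, Runnable sendResponse) {
            queue.add(new ThrottledResponse(delayMs, sendResponse));
        }

        @Override public void run() {
            try {
                while (true) {
                    // Blocks until the next response's delay has elapsed.
                    queue.take().sendResponse.run();
                }
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }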

Thanks,

Jun

On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle
time field in the response. The downside is that the client metrics will be
more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and
`log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should have
   the total time your request was throttled, irrespective of the quotas
   that caused it? Limiting it to the byte rate quota doesn't make sense,
   but I also don't think we want to end up adding new fields in the
   response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about I/O
   threads. Once we introduce these quotas people set them and expect them
   to be enforced (and if they aren't it may cause an outage). As a result
   they are a bit more sensitive than normal configs, I think. The current
   thread pools seem like something of an implementation detail and not the
   level the user-facing quotas should be involved with. I think it might be
   better to make this a general request-time throttle with no mention in
   the naming about I/O threads, and simply acknowledge the current
   limitation (which we may someday fix) in the docs that this covers only
   the time after the request is read off the network.
   3. As such I think the right interface to the user would be something
   like percent_request_time in {0,...,100} or request_time_ratio in
   {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale is
   between 0 and 1 in the other metrics, right?). See the sketch after this
   list.
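
A tiny sketch of how either spelling would be enforced (the config names are
the proposals above, nothing final):

    // Illustrative sketch only: percent_request_time in {0..100} and
    // request_time_ratio in {0.0..1.0} encode the same limit; only the
    // scale of the configured value differs.
    public final class RequestTimeQuota {
        public static boolean exceeded(double requestTimeMsInWindow,
                                       long windowMs,
                                       double quotaRatio) {   // 0.0 .. 1.0
            double observedRatio = requestTimeMsInWindow / windowMs;
            return observedRatio > quotaRatio;
        }

        // The percent form is just the ratio scaled by 100.
        public static boolean exceededPercent(double requestTimeMsInWindow,
                                              long windowMs,
                                              double quotaPercent) {
            return exceeded(requestTimeMsInWindow, windowMs, quotaPercent / 100.0);
        }
    }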

-Jay

On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on the co-existence of byte rate and
request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are
going to be very similar to the existing metrics and sensors. To avoid
confusion, I have now added more detail. All metrics are in the group
"quotaType" and all sensors have names starting with "quotaType" (where
quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/IOThread).
So there will be no reuse of existing metrics/sensors. The new ones for
request processing time based throttling will be completely independent of
existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be
impacted by this KIP. That will continue to return byte-rate based
throttling times. In addition, a new field request_throttle_time_ms will be
added to return request quota based throttling times. These will be exposed
as new metrics on the client side.

Since all metrics and sensors are different for each type of quota, I
believe there are already sufficient metrics to monitor throttling on both
the client and broker side for each type of throttling.
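
As a sketch of what that consistent naming could look like with the client
metrics library (illustrative only, not the KIP's actual registration code):

    import java.util.Collections;
    import org.apache.kafka.common.MetricName;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;

    // Illustrative sketch only: one independent throttle-time sensor per
    // quota type, all following the same "<quotaType>-..." pattern.
    public final class QuotaSensors {
        public static Sensor throttleTimeSensor(Metrics metrics, String quotaType) {
            Sensor sensor = metrics.sensor(quotaType + "-throttle-time");
            sensor.add(new MetricName("throttle-time-avg", quotaType,
                                      "Average throttle time in ms",
                                      Collections.<String, String>emptyMap()),
                       new Avg());
            return sensor;
        }
    }

Calling this with "Produce", "Fetch" or "IOThread" would yield parallel,
non-overlapping sensors, which is the independence described above.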
> > > > > > > > > > > > > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at
> 4:32
> > > AM,
> > > > > > Dong
> > > > > > > > Lin
> > > > > > > > > <
> > > > > > > > > > > > > > > > >> > lindong28@gmail.com
> > > > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > > > > > > > > >> > >> > >> > > >
Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to
quota users' traffic here. LGTM overall. I have some questions regarding
sensors.

- Can you be more specific in the KIP about what sensors will be added? For
example, it will be useful to specify the name and attributes of these new
sensors.

- We currently have throttle-time and queue-size for the byte-rate based
quota. Are you going to have separate throttle-time and queue-size for
requests throttled by the io_thread_unit-based quota, or will they share the
same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain
time due to the io_thread_unit-based quota?

- Currently the Kafka server doesn't provide any log or metrics that tell
whether any given clientId (or user) is throttled. This is not too bad
because we can still check the client-side byte-rate metric to validate
whether a given client is throttled. But with this io_thread_unit, there
will be no way to validate whether a given client is slow because it has
exceeded its io_thread_unit limit. It is necessary for users to be able to
know this information to figure out whether they have reached their quota
limit. How about we add a log4j log on the server side to periodically print
(client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that
the Kafka administrator can figure out which users have reached their limit
and act accordingly? (A sketch of such a periodic logger follows below.)
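
A minimal sketch of that idea (everything here is hypothetical; the map
would be fed by the quota managers, and System.out stands in for a log4j
logger):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Illustrative sketch only: periodically log per-client throttle times
    // so an admin can spot clients that are hitting their quotas.
    public final class ThrottleTimeLogger {
        private final Map<String, long[]> throttleMsByClient = new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

        public void record(String clientId, long byteRateMs, long ioThreadUnitMs) {
            throttleMsByClient.merge(clientId, new long[] {byteRateMs, ioThreadUnitMs},
                (a, b) -> new long[] {a[0] + b[0], a[1] + b[1]});
        }

        public void start(long periodSeconds) {
            scheduler.scheduleAtFixedRate(() -> {
                throttleMsByClient.forEach((clientId, t) ->
                    System.out.printf(
                        "client_id=%s byte-rate-throttle-ms=%d io-thread-unit-throttle-ms=%d%n",
                        clientId, t[0], t[1]));
                throttleMsByClient.clear();
            }, periodSeconds, periodSeconds, TimeUnit.SECONDS);
        }
    }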

Thanks,
Dong

On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc; overall LGTM except for a minor comment on the
throttling implementation:

It is stated that "Request processing time throttling will be applied on top
if necessary." I thought that meant the request processing time throttling
is applied first, but continuing to read I found it actually means the
produce/fetch byte rate throttling is applied first.

Also, the last sentence, "The remaining delay if any is applied to the
response.", is a bit confusing to me. Maybe reword it a bit?

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The property is called *io_thread_units* to align with the thread count property *num.io.threads*. When we implement network thread utilization quotas, we can add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun, did you mean a different request that needs to be added? The four requests currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is easy to exclude them and throttle only if unauthorized. I wasn't sure if there are other requests used only for inter-broker communication that need to be excluded.

3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling, if any, and then sends the response. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.
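
Roughly like the following (a sketch only, with simplified stand-in types rather than the real RequestChannel/KafkaApis classes; the quota manager method here is hypothetical):

    // Sketch of the helper described above: record the request's processing
    // time, apply any computed delay, then send the response.
    case class Request(clientId: String, processingTimeNanos: Long)
    case class Response(bytes: Array[Byte])

    trait RequestChannel { def sendResponse(response: Response): Unit }
    trait QuotaManager {
      // Returns the throttle delay in ms, or 0 if the client is within quota.
      def recordAndMaybeThrottle(clientId: String, timeNanos: Long): Long
    }

    class ApiHandler(channel: RequestChannel, quotas: QuotaManager) {
      def sendResponseMaybeThrottle(request: Request, response: Response): Unit = {
        val throttleMs =
          quotas.recordAndMaybeThrottle(request.clientId, request.processingTimeNanos)
        if (throttleMs > 0)
          Thread.sleep(throttleMs) // the real broker would delay via Purgatory, not block
        channel.sendResponse(response)
      }
    }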

Regards,

Rajini

On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers. While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better.
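
As a toy illustration (numbers made up; e.g. in the Scala REPL):

    // A percentage quota scales with the size of the thread pool, so doubling
    // num.io.threads silently doubles the user's absolute capacity.
    val quotaPercent = 10.0
    def absoluteCapacity(numIoThreads: Int): Double = numIoThreads * quotaPercent / 100.0
    println(absoluteCapacity(8))   // 0.8 thread-equivalents
    println(absoluteCapacity(16))  // 1.6 thread-equivalents for the same 10% quota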

2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.

3. Implementation-wise, I am wondering if it's simpler to apply the request time throttling first, in KafkaApis.handle(). Otherwise, we will need to add the throttling logic to each type of request.

Thanks,

Jun

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler utilization. At the moment it uses a percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP, and also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and do not have to worry about the internal distribution of the work between the two thread pools in Kafka.
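
For example, a user quota override could then look something like this (assuming the existing quota override path used for byte-rate quotas is reused; the config key name here is the one proposed in this thread and may still change):

    bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
      --add-config 'request_time_percent=10' \
      --entity-type users --entity-name user1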

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is exactly what people have said; I will just expand on that a bit. Consider the following case. The producer sends a produce request with a 10MB message, compressed to 100KB with gzip. The decompression of the message on the broker could take 10-15 seconds, during which time a request handler thread is completely blocked. In this case, neither the byte-in quota nor the request rate quota may be effective in protecting the broker. Consider another case. A consumer group starts with 10 instances and later on switches to 20 instances. The request rate will likely double, but the actual load on the broker may not double since each fetch request only contains half of the partitions. A request rate quota may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much of the server-side resources. In this particular KIP, this resource is the capacity of the request handler threads. I agree that it may not be intuitive for users to determine how to set the right limit. However, this is not completely new and has been done in the container world already. For example, Linux cgroups (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) have the concept of cpu.cfs_quota_us, which specifies the total amount of time in microseconds for which all tasks in a cgroup can run during a one-second period. We can potentially model the request handler threads in a similar way. For example, each request handler thread can be 1 request handler unit, and the admin can configure a limit on how many units (say 0.01) a client can have.
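
A minimal sketch of that unit-based accounting (all names and the fixed one-second window here are assumptions for illustration):

    // One request handler thread == one unit. A client with quota 0.01 may
    // consume at most 1% of one thread's time per one-second window.
    class RequestUnitQuota(unitsAllowed: Double, windowMs: Long = 1000L) {
      private var usedNanos = 0L
      private var windowStartNanos = System.nanoTime()

      // Record processing time; returns true if the client is still within quota.
      def record(processingNanos: Long): Boolean = synchronized {
        val now = System.nanoTime()
        if (now - windowStartNanos > windowMs * 1000000L) {
          usedNanos = 0L            // start a new window
          windowStartNanos = now
        }
        usedNanos += processingNanos
        usedNanos <= (unitsAllowed * windowMs * 1000000L).toLong
      }
    }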

Regarding not throttling the internal broker-to-broker requests: we could do that. Alternatively, we could just let the admin configure a high limit for the kafka user (it may not be possible to do that easily based on clientId, though).

Ideally we want to be able to protect the utilization of the network thread pool too. The difficulty is mostly what Rajini said: (1) the mechanism for throttling the requests is through Purgatory, and we will have to think through how to integrate that into the network layer; (2) in the network layer, we currently know the user, but not the clientId of the request, so it's a bit tricky to throttle based on clientId there. Plus, the byteOut quota can already protect the network thread utilization for fetch requests. So, if we can't figure out this part right now, just focusing on the request handling threads for this KIP is still a useful feature.

Thanks,

Jun

On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that protecting the cluster is more important than protecting individual apps. I have retained the exemption for StopReplica/LeaderAndIsr etc.; these are throttled only if authorization fails (so they can't be used for DoS attacks in a secure cluster, but inter-broker requests complete without delays).

I will wait another day to see if there is any objection to quotas based on request processing time (as opposed to request rate) and, if there are no objections, I will revert to the original proposal with some changes.

The original proposal included only the time used by the request handler threads (that made the calculation easy). I think the suggestion is to include the time spent in the network threads as well, since that may be significant. As Jay pointed out, it is more complicated to calculate the total available CPU time and convert to a ratio when there are *m* I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give us what we want, but it can be very expensive on some platforms. As Becket and Guozhang have pointed out, we do have several time measurements already for generating metrics that we could use, though we might want to switch to nanoTime() instead of currentTimeMillis() since some of the values for small requests may be < 1ms. But rather than add up the time spent in the I/O thread and the network thread, wouldn't it be better to convert the time spent on each thread into a separate ratio? Say UserA has a request quota of 5%. Can we take that to mean that UserA can use 5% of the time on network threads and 5% of the time on I/O threads? If either is exceeded, the response is throttled - it would mean maintaining two sets of metrics for the two durations, but it would result in more meaningful ratios. We could define
> > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > two
> > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > quota
> > limits
> > > > > > (UserA
> > > > > > > > has
> > > > > > > > > 5%
> > > > > > > > > > > of
> > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > >> > threads
> > > > > > > > > > > > > > > > >> > >> > and
> > > > > > > > > > > > > > > > >> > >> > >> 10%
> > > > > > > > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > > > > > > > >> > >> > >> > > > > > > network
> > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > threads),
> > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > but that
> > > seems
> > > > > > > > > unnecessary
> > > > > > > > > > > and
> > > > > > > > > > > > > > > harder
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > >> > >> explain
> > > > > > > > > > > > > > > > >> > >> > >> to
> > > > > > > > > > > > > > > > >> > >> > >> > > > users.
> > > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
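> As a rough illustration (not in the KIP; the helper name here is made
> up), per-thread CPU time via ThreadMXBean would look something like the
> following, which is why its per-call cost on some platforms matters:
>
>     import java.lang.management.ManagementFactory;
>     import java.lang.management.ThreadMXBean;
>
>     // Hypothetical: measure CPU time (not wall-clock) used by one
>     // request on the current thread.
>     long requestCpuNanos(Runnable handleRequest) {
>         ThreadMXBean bean = ManagementFactory.getThreadMXBean();
>         if (!bean.isThreadCpuTimeSupported())
>             return -1;                                // would need a fallback
>         long start = bean.getCurrentThreadCpuTime();  // can be expensive
>         handleRequest.run();
>         return bean.getCurrentThreadCpuTime() - start;
>     }
>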
> Back to why and how quotas are applied to network thread utilization:
>
> a) In the case of fetch, the time spent in the network thread may be
> significant and I can see the need to include this. Are there other
> requests where the network thread utilization is significant? In the case
> of fetch, request handler thread utilization would throttle clients with
> high request rate and low data volume, and the fetch byte rate quota will
> throttle clients with high data volume. Network thread utilization is
> perhaps proportional to the data volume. I am wondering if we even need to
> throttle based on network thread utilization or whether the data volume
> quota covers this case.
>
> b) At the moment, we record and check for quota violation at the same
> time. If a quota is violated, the response is delayed. Using Jay's example
> of disk reads for fetches happening in the network thread, we can't record
> and delay a response after the disk reads. We could record the time spent
> on the network thread when the response is complete and introduce a delay
> for handling a subsequent request (separating out recording and quota
> violation handling in the case of network thread overload). Does that make
> sense?
>
> Regards,
>
> Rajini
>
> On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:
>
> > Hey Jay,
> >
> > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > thinking that maybe we can use the existing request statistics. They are
> > already very detailed so we can probably see the approximate CPU time
> > from it, e.g. something like (total_time - request/response_queue_time -
> > remote_time).
> >
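> > For illustration only (these variable names are placeholders, not the
> > actual metric names), that back-of-the-envelope estimate is just:
> >
> >     // Rough sketch: queue time and remote time are spent waiting,
> >     // not doing CPU work, so subtracting them approximates CPU time.
> >     double approxCpuTimeMs = totalTimeMs
> >             - requestQueueTimeMs    // waiting for an I/O thread
> >             - responseQueueTimeMs   // waiting for a network thread
> >             - remoteTimeMs;         // waiting on replication/purgatory
> >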
> > I agree with Guozhang that when a user is throttled it is likely that we
> > need to see if anything has gone wrong first, and if the users are well
> > behaving and just need more resources, we will have to bump up the quota
> > for them. It is true that pre-allocating CPU time quota precisely for
> > the users is difficult. So in practice it would probably be more like
> > first setting a relatively high protective CPU time quota for everyone
> > and increasing it for some individual clients on demand.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> >
> > > This is a great proposal, glad to see it happening.
> > >
> > > I am inclined to the CPU throttling, or more specifically the
> > > processing time ratio, instead of the request rate throttling as well.
> > > Becket has summed up my rationales very well above, and one thing to
> > > add here is that the former has good support both for "protecting
> > > against rogue clients" and for "utilizing a cluster for multi-tenancy
> > > usage": when thinking about how to explain this to the end users, I
> > > find it actually more natural than the request rate since, as
> > > mentioned above, different requests will have quite different "cost",
> > > and Kafka today already has various request types (produce, fetch,
> > > admin, metadata, etc); because of that, request rate throttling may
> > > not be as effective unless it is set very conservatively.
> > >
> > > Regarding user reactions when they are throttled, I think it may
> > > differ case-by-case, and needs to be discovered / guided by looking at
> > > relative metrics. So in other words users would not expect to get
> > > additional information by simply being told "hey, you are throttled",
> > > which is all that throttling does; they need to take a follow-up step
> > > and see "hmm, I'm throttled probably because of ..", which is by
> > > looking at other metric values: e.g. whether I'm bombarding the
> > > brokers with ...
> > >
> > > [Message clipped]
>
> --
> *Todd Palino*
> Staff Site Reliability Engineer
> Data Infrastructure Streaming
>
> linkedin.com/in/toddpalino

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

The updated KIP looks good. Just one more comment.

40. "An additional metric exempt-request-time will also be added for each
quota entity for the quota type Request." Should that metric be added for
each entity type (e.g., user, client-id, etc)? It seems that value is
independent of entity types.

Thanks,

Jun

On Thu, Mar 9, 2017 at 12:07 PM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Hi Jun,
>
> Thank you for reviewing the KIP again.
>
> 30. That is a good idea. In fact, it is one of the advantages of measuring
> overall utilization rather than separate values for network and I/O threads
> as I had intended initially. Have updated the KIP, thanks.
>
> 31. Added exempt-request-time metric.
>
> 32. I had thought of using quota.window.size.seconds * quota.window.num
> initially, but felt that would be too big. Even the default of 11 seconds
> is a rather long time to be throttled. With a limit of
> quota.window.size.seconds, subsequent requests for that total interval of
> the samples will also each be throttled for quota.window.size.seconds if
> the time recorded was very high. So limiting at quota.window.size.seconds
> limits the throttle time for an individual request, avoiding timeouts where
> possible, but still throttles over a period of time.
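>
> As a rough sketch of that capping behaviour (illustrative only, not the
> actual broker code):
>
>     // A single request is never delayed longer than one quota window,
>     // but sustained overuse keeps throttling subsequent requests.
>     long throttleTimeMs(double measured, double quota, long windowMs) {
>         if (measured <= quota)
>             return 0;
>         long delay = (long) (windowMs * (measured - quota) / quota);
>         return Math.min(delay, windowMs); // cap at quota.window.size.seconds
>     }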
>
> 33. Updated to use request_percentage.
>
>
> On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Rajini,
> >
> > Thanks for the updated KIP. A few more comments.
> >
> > 30. Should we just account for the time in network threads in this KIP
> > too? The issue with doing this later is that existing quotas may be too
> > small and everyone will have to adjust them before upgrading, which is
> > inconvenient. If we just do the delaying in the io threads, there
> > probably isn't too much additional work to include the network thread
> > time?
> >
> > 31. It would be useful for the new metrics to capture the utilization of
> > all those requests exempt from request throttling (under sth like
> > "exempt"). It's useful for an admin to know how much time is spent there
> > too.
> >
> > 32. "The maximum throttle time for any single request will be the quota
> > window size (one second by default)." We probably should cap the delay at
> > quota.window.size.seconds * quota.window.num?
> >
> > 33. It's unfortunate that we use . in configs and _ in ZK data
> > structures. However, for consistency, request.percentage in ZK probably
> > should be request_percentage?
> >
> > Thanks,
> >
> > Jun
> >
> > On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <ra...@gmail.com>
> > wrote:
> >
> > > I have updated the KIP to use "request.percentage" quotas where the
> > > percentage is out of a total of (num.io.threads * 100). I have added
> > > the other options considered so far under "Rejected Alternatives".
> > >
> > > To address Todd's concern about per-thread quotas: even though the
> > > quotas are out of (num.io.threads * 100), clients are not locked into
> > > threads. Utilization is measured as the total across all the I/O
> > > threads, so a 10% quota can be 1% on each of 10 threads. Individual
> > > quotas can also be greater than 100% if required.
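> > >
> > > As a quick worked sketch of that accounting (the names here are
> > > invented for illustration):
> > >
> > >     // Illustrative: usage is the sum over all I/O threads, out of a
> > >     // total capacity of num.io.threads * 100.
> > >     double[] perThreadPercent = {1, 1, 1, 1, 1, 1, 1, 1, 1, 1};
> > >     double used = 0;
> > >     for (double p : perThreadPercent)
> > >         used += p;                       // 10.0 across 10 threads
> > >     boolean throttled = used > 10.0;     // a 10% quota is not exceeded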
> > >
> > > Please let me know if there are any other concerns or suggestions.
> > >
> > > Thank you,
> > >
> > > Rajini
> > >
> > > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tp...@gmail.com> wrote:
> > >
> > > > Rajini -
> > > >
> > > > I understand what you’re saying, but the point I’m making is that I
> > > > don’t believe we need to take it into account directly. The CPU
> > > > utilization of the network threads is directly proportional to the
> > > > number of bytes being sent. The more bytes, the more CPU that is
> > > > required for SSL (or other tasks). This is opposed to the request
> > > > handler threads, where there are a number of factors that affect CPU
> > > > utilization. This means that it’s not necessary to separately quota
> > > > network thread byte usage and CPU - if we quota byte usage (which we
> > > > already do), we have fixed the CPU usage at a proportional amount.
> > > >
> > > > Jun -
> > > >
> > > > Thanks for the clarification there. I was thinking of the
> > > > utilization percentage as being fixed, not what the percentage
> > > > reflects. I’m not tied to either way of doing it, provided that we
> > > > do not lock clients to a single thread. For example, if I specify
> > > > that a given client can use 10% of a single thread, that should also
> > > > mean they can use 1% on 10 threads.
> > > >
> > > > -Todd
> > > >
> > > >
> > > >
> > > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Todd,
> > > > >
> > > > > Thanks for the feedback.
> > > > >
> > > > > I just want to clarify your second point. If the limit percentage
> > > > > is per thread and the thread counts are changed, the absolute
> > > > > processing limit for existing users hasn't changed and there is no
> > > > > need to adjust them. On the other hand, if the limit percentage is
> > > > > of total thread pool capacity and the thread counts are changed,
> > > > > the effective processing limit for a user will change. So, to
> > > > > preserve the current processing limit, existing user limits have
> > > > > to be adjusted. If there is a hardware change, the effective
> > > > > processing limit for a user will change in either approach and the
> > > > > existing limit may need to be adjusted. However, hardware changes
> > > > > are less common than thread pool configuration changes.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tp...@gmail.com> wrote:
> > > > >
> > > > > > I’ve been following this one on and off, and overall it sounds
> > > > > > good to me.
> > > > > >
> > > > > > - The SSL question is a good one. However, that type of overhead
> > > > > > should be proportional to the bytes rate, so I think that a
> > > > > > bytes rate quota would still be a suitable way to address it.
> > > > > >
> > > > > > - I think it’s better to make the quota percentage of total
> > > > > > thread pool capacity, and not percentage of an individual
> > > > > > thread. That way you don’t have to adjust it when you adjust
> > > > > > thread counts (tuning, hardware changes, etc.)
> > > > > >
> > > > > >
> > > > > > -Todd
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <becket.qin@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > I see. Good point about SSL.
> > > > > > >
> > > > > > > I just asked Todd to take a look.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Hi, Jiangjie,
> > > > > > > >
> > > > > > > > Yes, I agree that byte rate already protects the network
> > > > > > > > threads indirectly. I am not sure if byte rate fully
> > > > > > > > captures the CPU overhead in network due to SSL. So, at the
> > > > > > > > high level, we can use request time limit to protect CPU and
> > > > > > > > use byte rate to protect storage and network.
> > > > > > > >
> > > > > > > > Also, do you think you can get Todd to comment on this KIP?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <becket.qin@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi Rajini/Jun,
> > > > > > > > >
> > > > > > > > > The percentage based reasoning sounds good.
> > > > > > > > > One thing I am wondering is that if we assume the network
> > > > > > > > > threads are just doing the network IO, can we say the
> > > > > > > > > bytes rate quota is already sort of a network threads
> > > > > > > > > quota? If we take network threads into consideration here,
> > > > > > > > > would that be somewhat overlapping with the bytes rate
> > > > > > > > > quota?
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > >
> > > > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Jun,
> > > > > > > > > >
> > > > > > > > > > Thank you for the explanation, I hadn't realized you
> > > > > > > > > > meant percentage of the total thread pool. If everyone
> > > > > > > > > > is OK with Jun's suggestion, I will update the KIP.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Rajini
> > > > > > > > > >
> > > > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <jun@confluent.io>
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > >
> > > > > > > > > > > Let's take your example. Let's say a user sets the
> > > > > > > > > > > limit to 50%. I am not sure if it's better to apply
> > > > > > > > > > > the same percentage separately to network and io
> > > > > > > > > > > thread pool. For example, for produce requests, most
> > > > > > > > > > > of the time will be spent in the io threads whereas
> > > > > > > > > > > for fetch requests, most of the time will be in the
> > > > > > > > > > > network threads. So, using the same percentage in both
> > > > > > > > > > > thread pools means one of the pools' resource will be
> > > > > > > > > > > over allocated.
> > > > > > > > > > >
> > > > > > > > > > > An alternative way is to simply model network and io
> > > > > > > > > > > thread pool together. If you get 10 io threads and 5
> > > > > > > > > > > network threads, you get 1500% request processing
> > > > > > > > > > > power. A 50% limit means a total of 750% processing
> > > > > > > > > > > power. We just add up the time a user request spent in
> > > > > > > > > > > either network or io thread. If that total exceeds
> > > > > > > > > > > 750% (doesn't matter whether it's spent more in
> > > > > > > > > > > network or io thread), the request will be throttled.
> > > > > > > > > > > This seems more general and is not sensitive to the
> > > > > > > > > > > current implementation detail of having a separate
> > > > > > > > > > > network and io thread pool. In the future, if the
> > > > > > > > > > > threading model changes, the same concept of quota can
> > > > > > > > > > > still be applied. For now, since it's a bit tricky to
> > > > > > > > > > > add the delay logic in the network thread pool, we
> > > > > > > > > > > could probably just do the delaying only in the io
> > > > > > > > > > > threads as you suggested earlier.
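> > > > > > > > > > >
> > > > > > > > > > > A tiny sketch of that combined accounting
> > > > > > > > > > > (illustrative only, not actual broker code):
> > > > > > > > > > >
> > > > > > > > > > >     // Capacity is (numIo + numNetwork) * 100 "percent
> > > > > > > > > > >     // units"; usage is summed across both pools.
> > > > > > > > > > >     int numIo = 10, numNetwork = 5;
> > > > > > > > > > >     double capacity = (numIo + numNetwork) * 100.0; // 1500
> > > > > > > > > > >     double quota = 0.50 * capacity;                 // 750
> > > > > > > > > > >     double used = ioPercent + networkPercent;
> > > > > > > > > > >     boolean throttle = used > quota;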
> > > > > > > > > > >
> > > > > > > > > > > There is still the orthogonal question of whether a
> > > > > > > > > > > quota of 50% is out of 100% or 100% * #total
> > > > > > > > > > > processing threads. My feeling is that the latter is
> > > > > > > > > > > slightly better based on my explanation earlier. The
> > > > > > > > > > > way to describe this quota to the users can be "share
> > > > > > > > > > > of elapsed request processing time on a single CPU"
> > > > > > > > > > > (similar to top).
> > > > > > > > > > > single CPU" (similar to top).
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Jun,
> > > > > > > > > > > >
> > > > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > > > >
> > > > > > > > > > > > But still not sure about a single quota covering
> > > > > > > > > > > > both network threads and I/O threads with per-thread
> > > > > > > > > > > > quota. If there are 10 I/O threads and 5 network
> > > > > > > > > > > > threads and I want to assign half the quota to
> > > > > > > > > > > > userA, the quota would be 750%. I imagine,
> > > > > > > > > > > > internally, we would convert this to 500% for I/O
> > > > > > > > > > > > and 250% for network threads to allocate 50% of each
> > > > > > > > > > > > pool.
> > > > > > > > > > > >
> > > > > > > > > > > > A couple of scenarios:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. Admin adds 1 extra network thread. To retain 50%,
> > > > > > > > > > > > admin needs to now allocate 800% for each user. Or
> > > > > > > > > > > > increase the quota for a few users. To me, it feels
> > > > > > > > > > > > like admin needs to convert 50% to 800% and Kafka
> > > > > > > > > > > > internally needs to convert 800% to (500%, 300%).
> > > > > > > > > > > > Everyone using just 50% feels a lot simpler.
> > > > > > > > > > > >
> > > > > > > > > > > > 2. We decide to add some other thread to this list.
> > > > > > > > > > > > Admin needs to know exactly how many threads form
> > > > > > > > > > > > the maximum quota. And we can be changing this
> > > > > > > > > > > > between broker versions as we add more to the list.
> > > > > > > > > > > > Again a single overall percent would be a lot
> > > > > > > > > > > > simpler.
> > > > > > > > > > > >
> > > > > > > > > > > > There were others who were unconvinced by a single
> > > > > > > > > > > > percent from the initial proposal and were happier
> > > > > > > > > > > > with thread units similar to CPU units, so I am ok
> > > > > > > > > > > > with going with per-thread quotas (as units or
> > > > > > > > > > > > percent). Just not sure it makes it easier for admin
> > > > > > > > > > > > in all cases.
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > >
> > > > > > > > > > > > Rajini
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Consider modeling as n * 100% unit. For 2), the
> > > > > > > > > > > > > question is what's causing the I/O threads to be
> > > > > > > > > > > > > saturated. It's unlikely that all users'
> > > > > > > > > > > > > utilization has increased at the same time. A more
> > > > > > > > > > > > > likely case is that a few isolated users'
> > > > > > > > > > > > > utilization has increased. If so, after increasing
> > > > > > > > > > > > > the number of threads, the admin just needs to
> > > > > > > > > > > > > adjust the quota for a few isolated users, which
> > > > > > > > > > > > > is expected and is less work.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all
> > > > > > > > > > > > > users' quotas need to be adjusted, which is
> > > > > > > > > > > > > unexpected and is more work.
> > > > > > > > > > > > >
> > > > > > > > > > > > > So, to me, the n * 100% model seems more
> > > > > > > > > > > > > convenient.
> > > > > > > > > > > > >
> > > > > > > > > > > > > As for future extension to cover network thread
> > > > > > > > > > > > > utilization, I was thinking that one way is to
> > > > > > > > > > > > > simply model the capacity as (n + m) * 100% unit,
> > > > > > > > > > > > > where n and m are the number of network and i/o
> > > > > > > > > > > > > threads, respectively. Then, for each user, we can
> > > > > > > > > > > > > just add up the utilization in the network and the
> > > > > > > > > > > > > i/o thread. If we do this, we don't need a new
> > > > > > > > > > > > > type of quota.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If we use request.percentage as the percentage
> > > > > > > > > > > > > > used in a single I/O thread, the total
> > > > > > > > > > > > > > percentage being allocated will be
> > > > > > > > > > > > > > num.io.threads * 100 for I/O threads and
> > > > > > > > > > > > > > num.network.threads * 100 for network threads. A
> > > > > > > > > > > > > > single quota covering the two as a percentage
> > > > > > > > > > > > > > wouldn't quite work if you want to allocate the
> > > > > > > > > > > > > > same proportion in both cases. If we want to
> > > > > > > > > > > > > > treat threads as separate units, won't we need
> > > > > > > > > > > > > > two quota configurations regardless of whether
> > > > > > > > > > > > > > we use units or percentage? Perhaps I
> > > > > > > > > > > > > > misunderstood your suggestion.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >    1. The use case that you mentioned where an
> > > > > > > > > > > > > >    admin is adding more users and decides to add
> > > > > > > > > > > > > >    more I/O threads and expects to find free
> > > > > > > > > > > > > >    quota to allocate for new users.
> > > > > > > > > > > > > >    2. Admin adds more I/O threads because the
> > > > > > > > > > > > > >    I/O threads are saturated and there are cores
> > > > > > > > > > > > > >    available to allocate, even though the number
> > > > > > > > > > > > > >    of users/clients hasn't changed.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > If we treated I/O threads as a single unit of
> > > > > > > > > > > > > > 100%, all user quotas need to be reallocated for
> > > > > > > > > > > > > > 1). If we allocated I/O threads as n units with
> > > > > > > > > > > > > > n*100%, all user quotas need to be reallocated
> > > > > > > > > > > > > > for 2), otherwise some of the new threads may
> > > > > > > > > > > > > > just not be used. Either way it should be easy
> > > > > > > > > > > > > > to write a script to decrease/increase quotas by
> > > > > > > > > > > > > > a multiple for all users.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > So it really boils down to which quota unit is
> > > > > > > > > > > > > > most intuitive in terms of configuration. And
> > > > > > > > > > > > > > from the discussion so far, it feels like
> > > > > > > > > > > > > > opinion is divided on whether quotas should be
> > > > > > > > > > > > > > carved out of an absolute 100% (or 1 unit) or be
> > > > > > > > > > > > > > relative to the number of threads (n*100% or n
> > > > > > > > > > > > > > units).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Another way to express an absolute limit is to
> > > > > > > > > > > > > > > use request.percentage, but treat it as the
> > > > > > > > > > > > > > > percentage used in a single request handling
> > > > > > > > > > > > > > > thread. For now, the request handling threads
> > > > > > > > > > > > > > > can be just the io threads. In the future,
> > > > > > > > > > > > > > > they can cover the network threads as well.
> > > > > > > > > > > > > > > This is similar to how top reports CPU usage
> > > > > > > > > > > > > > > and may be a bit easier for people to
> > > > > > > > > > > > > > > understand.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hi, Jay,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2. Regarding request.unit vs
> > > > > > > > > > > > > > > > request.percentage. I started with
> > > > > > > > > > > > > > > > request.percentage too. The reasoning for
> > > > > > > > > > > > > > > > request.unit is the following. Suppose that
> > > > > > > > > > > > > > > > the capacity has been reached on a broker
> > > > > > > > > > > > > > > > and the admin needs to add a new user. A
> > > > > > > > > > > > > > > > simple way to increase the capacity is to
> > > > > > > > > > > > > > > > increase the number of io threads, assuming
> > > > > > > > > > > > > > > > there are still enough cores. If the limit
> > > > > > > > > > > > > > > > is based on percentage, the additional
> > > > > > > > > > > > > > > > capacity automatically gets distributed to
> > > > > > > > > > > > > > > > existing users and we haven't really carved
> > > > > > > > > > > > > > > > out any additional resource for the new
> > > > > > > > > > > > > > > > user. Now, is it easy for a user to reason
> > > > > > > > > > > > > > > > about 0.1 unit vs 10%? My feeling is that
> > > > > > > > > > > > > > > > both are hard and have to be configured
> > > > > > > > > > > > > > > > empirically. Not sure if percentage is
> > > > > > > > > > > > > > > > obviously easier to reason about.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > > >
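
A small worked example of the capacity argument, with hypothetical numbers (8 io threads, 10 existing users, a quota of 10% or 0.8 units each; none of these figures are from the KIP):

    // Why absolute units carve out capacity for a new user while
    // percentages silently redistribute it. Numbers are hypothetical.
    object QuotaCapacity {
      def main(args: Array[String]): Unit = {
        val threadsBefore = 8
        val threadsAfter  = 10 // admin adds two io threads
        val users         = 10

        // Percentage-based: each user keeps 10%, so the new capacity is
        // absorbed by existing users and nothing is freed for user 11.
        val pct = 10.0
        println(s"percentage: ${pct / 100 * threadsBefore} -> ${pct / 100 * threadsAfter} threads/user")

        // Absolute units: each user keeps 0.8 thread units, so the two
        // new threads' worth of capacity stays free for new users.
        val units = 0.8
        println(s"units: ${threadsAfter - users * units} thread units free")
      }
    }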

On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:

A couple of quick points:

1. Even though the implementation of this quota is only using io thread time, I think we should call it something like "request-time". This will give us flexibility to improve the implementation to cover network threads in the future and will avoid exposing internal details like our thread pools on the server.

2. Jun/Roger, I get what you are trying to fix, but the idea of thread/units is super unintuitive as a user-facing knob. I had to read the KIP like eight times to understand this. I'm not sure I agree that increasing the number of threads is a problem with a percentage-based value; it really depends on whether the user thinks about the "percentage of request processing time" or "thread units". If they think "I have allocated 10% of my request processing time to user x" then it is a bug that increasing the thread count decreases that percent, as it does in the current proposal. As a practical matter I think the only way to actually reason about this is as a percent; I just don't believe people are going to think, "ah, 4.3 thread units, that is the right amount!". Instead I think they have to understand this thread unit concept, figure out what they have set in number of threads, compute a percent and then come up with the number of thread units, and these will all be wrong if that thread count changes. I also think this ties us to throttling the I/O thread pool, which may not be where we want to end up.

3. For what it's worth I do think having a single throttle_ms field in all the responses that combines all throttling from all quotas is probably the simplest. There could be a use case for having separate fields for each, but I think that is actually harder to use/monitor in the common case, so unless someone has a use case I think just one should be fine.

-Jay

On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

I have updated the KIP based on the discussions so far.

Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like LeaderAndIsr etc. The simplest way to ensure that clients cannot use these requests to bypass quotas for DoS attacks is to ensure that ACLs prevent clients from using these requests and unauthorized requests are included towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a separate throttle time, and all utilization based quotas could use the same field (we won't add another one for network thread utilization, for instance). But perhaps it makes sense to keep byte rate quotas separate in produce/fetch responses to provide separate metrics? Agree with Ismael that the name of the existing field should be changed if we have two. Happy to switch to a single combined throttle time if that is sufficient.

Ismael #4, #5, #6: Will update KIP. Will use a dot separated name for the new property. Replication quotas use dot separated, so it will be consistent with all properties except byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen because the time per request can vary significantly between requests, as mentioned in the discussion and KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more configuration and more metrics. Since most users would set quotas higher than the expected usage and quotas are more of a safety net, a single quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of active connections, since only one request per connection will be throttled at a time.
#4 As with byte rate quotas, to use the full allocated quotas, clients/users would need to use partitions that are distributed across the cluster. The alternative of using cluster-wide quotas instead of per-broker quotas would be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for quota types Fetch and Produce. A new one will be added for IOThread, which manages quotas for I/O thread utilization. This will not update the Fetch or Produce queue-size, but will have a separate metric for the queue-size. I wasn't planning to add any additional metrics apart from the equivalent ones for existing quotas as part of this KIP. The ratio of byte-rate to I/O thread utilization could be slightly misleading since it depends on the sequence of requests. But we can look into more metrics after the KIP is implemented, if required.

I think we need to limit the maximum delay since all requests are throttled. If a client has a quota of 0.001 units and a single request used 50ms, we don't want to delay all requests from the client by 50 seconds, throwing the client out of all its consumer groups. The issue is only if a user is allocated a quota that is insufficient to process one large request. The expectation is that the units allocated per user will be much higher than the time taken to process one request and the limit should seldom be applied. Agree this needs proper documentation.

Regards,

Rajini
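
To make the arithmetic in that last paragraph concrete, a minimal sketch of the delay computation and the cap (the cap value, the simple clamping, and all names are illustrative assumptions, not the KIP's implementation):

    // If a request consumed t ms of io thread time and the quota is q
    // (fraction of one thread), delaying so that t ms of work happens
    // every t/q ms brings the average usage down to q.
    object ThrottleDelay {
      val maxDelayMs = 30000.0 // hypothetical cap, e.g. the quota window

      def uncappedDelayMs(threadTimeMs: Double, units: Double): Double =
        threadTimeMs / units - threadTimeMs

      def delayMs(threadTimeMs: Double, units: Double): Double =
        math.min(uncappedDelayMs(threadTimeMs, units), maxDelayMs)

      def main(args: Array[String]): Unit = {
        // Rajini's example: a 0.001-unit quota and one 50ms request.
        println(uncappedDelayMs(50, 0.001)) // ~49950 ms, i.e. about 50 seconds
        println(delayMs(50, 0.001))         // capped at 30000.0 ms
      }
    }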

On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:

@jun: I wasn't concerned about tying up a request processing thread, but IIUC the code does still read the entire request out, which might add up to a non-negligible amount of memory.

On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to the window size if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota over a long period of time. Can you provide an upper bound on this deviation?

2) What is the motivation for capping the maximum delay at the window size? I am wondering if there is a better alternative to address the problem.

3) It means that the existing metric-related config will have a more direct impact on the mechanism of this io-thread-unit-based quota. This may be an important change depending on the answer to 1) above. We probably need to document this more explicitly.

Dong
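
For question 1, a back-of-envelope bound under the assumption that the delay is simply clamped to the window size (all numbers hypothetical):

    // Worst case: a client whose requests each cost far more thread time
    // than its quota allows can still issue one request per window.
    object QuotaDeviation {
      def main(args: Array[String]): Unit = {
        val windowMs = 1000.0 // cap on the throttle delay
        val quota    = 0.001  // allocated fraction of one io thread
        val reqMs    = 50.0   // io thread time of each large request

        // One 50ms request every ~1050ms gives an effective usage of:
        val effective = reqMs / (reqMs + windowMs)
        println(f"effective usage $effective%.4f vs quota $quota") // ~48x over
      }
    }

So the deviation grows with the ratio of per-request thread time to the window size; Rajini's reply above argues this case should be rare in practice.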

On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't, because at LinkedIn it would be too much pressure on inGraph to expose those per-clientId metrics, so we ended up printing them periodically to a local log. Never mind if it is not a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for every quota in ProduceResponse or FetchResponse. Is there any use-case for having separate throttle-time fields for byte-rate-quota and io-thread-unit-quota? You probably need to document this as an interface change if you plan to add a new field in any request.

- I don't think IOThread belongs to quotaType. The existing quota types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the type of request that is throttled, not the quota mechanism that is applied.

- If a request is throttled due to this io-thread-unit-based quota, is the existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the io-thread-unit-based quota and for users to understand its impact on their traffic, would it be useful to have a metric that shows the overall byte-rate per io-thread-unit? Can we also show this as a per-clientId metric?

Thanks,
Dong

On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU cores, but it's possible for an admin to start with fewer io threads than cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of returning an error immediately is that the latter has no way to prevent the client from retrying immediately, which will make things worse. The delaying logic is based off a delay queue. A separate expiration thread just waits on the next request to be expired. So, it doesn't tie up a request handler thread.

Thanks,

Jun
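
A minimal sketch of that mechanism, built on java.util.concurrent.DelayQueue (class and method names here are illustrative, not Kafka's own code):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // Throttled responses are queued with their remaining delay; a single
    // expiration thread blocks in take() until the next one is due, so no
    // request handler thread is held while a response waits.
    class ThrottledResponse(val send: () => Unit, delayMs: Long) extends Delayed {
      private val dueAtNs = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs)
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(dueAtNs - System.nanoTime(), TimeUnit.NANOSECONDS)
      override def compareTo(o: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS), o.getDelay(TimeUnit.NANOSECONDS))
    }

    object ThrottleReaper {
      def main(args: Array[String]): Unit = {
        val queue  = new DelayQueue[ThrottledResponse]()
        val reaper = new Thread(() => while (true) queue.take().send(), "throttle-reaper")
        reaper.setDaemon(true)
        reaper.start()
        queue.put(new ThrottledResponse(() => println("sent after 200ms"), 200))
        Thread.sleep(500) // let the reaper fire before the demo exits
      }
    }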

On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle time field in the response. The downside is that the client metrics will be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and `log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

1. Isn't it the case that the throttling time response field should have the total time your request was throttled, irrespective of the quotas that caused it? Limiting it to the byte rate quota doesn't make sense, but I also don't think we want to end up adding new fields in the response for every single thing we quota, right?

2. I don't think we should make this quota specifically about io threads. Once we introduce these quotas people set them and expect them to be enforced (and if they aren't it may cause an outage). As a result they are a bit more sensitive than normal configs, I think. The current thread pools seem like something of an implementation detail and not the level the user-facing quotas should be involved with. I think it might be better to make this a general request-time throttle with no mention of I/O threads in the naming, and simply acknowledge the current limitation (which we may someday fix) in the docs: that this covers only the time after the request is read off the network.

3. As such I think the right interface to the user would be something like percent_request_time in {0,...,100} or request_time_ratio in {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale is between 0 and 1 in the other metrics, right?)

-Jay
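
The two scales in point 3 carry the same information; a trivial illustration (the names are Jay's suggestions, not settled configs):

    object QuotaScales {
      // percent_request_time in {0,...,100} vs request_time_ratio in {0.0,...,1.0}
      def percentToRatio(percentRequestTime: Double): Double = {
        require(percentRequestTime >= 0 && percentRequestTime <= 100)
        percentRequestTime / 100.0
      }
      def main(args: Array[String]): Unit =
        println(percentToRatio(12.5)) // request_time_ratio = 0.125
    }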

On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are going to be very similar to the existing metrics and sensors. To avoid confusion, I have now added more detail. All metrics are in the group "quotaType" and all sensors have names starting with "quotaType" (where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*). So there will be no reuse of existing metrics/sensors. The new ones for request processing time based throttling will be completely independent of existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be impacted by this KIP. That will continue to return byte-rate based throttling times. In addition, a new field request_throttle_time_ms will be added to return request quota based throttling times. These will be exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I believe there are already sufficient metrics to monitor throttling on both the client and broker side for each type of throttling.

Regards,

Rajini
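
For illustration, this is roughly what that naming scheme could look like with Kafka's metrics library (assumes kafka-clients on the classpath; the specific sensor and metric names below are made up for the example):

    import org.apache.kafka.common.metrics.Metrics
    import org.apache.kafka.common.metrics.stats.{Avg, Rate}

    object QuotaSensors {
      def main(args: Array[String]): Unit = {
        val metrics   = new Metrics()
        val quotaType = "IOThread" // Produce/Fetch/.../IOThread

        // Per-user sensor tracking throttle time for this quota type.
        val throttleTime = metrics.sensor(s"${quotaType}ThrottleTime-user1")
        throttleTime.add(
          metrics.metricName("throttle-time", quotaType, "average throttle time in ms"),
          new Avg())

        // Per-user sensor tracking io thread utilization.
        val usage = metrics.sensor(s"$quotaType-user1")
        usage.add(
          metrics.metricName("request-time", quotaType, "io thread time rate"),
          new Rate())

        usage.record(25.0)       // 25ms of io thread time for a request
        throttleTime.record(0.0) // this request was not throttled
      }
    }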

On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to quota a user's traffic here. LGTM overall. I have some questions regarding sensors.

- Can you be more specific in the KIP about what sensors will be added? For example, it will be useful to specify the name and attributes of these new sensors.

- We currently have throttle-time and queue-size for the byte-rate based quota. Are you going to have separate throttle-time and queue-size for requests throttled by the io_thread_unit-based quota, or will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain time due to the io_thread_unit-based quota?

- Currently the Kafka server doesn't provide any log or metrics that tell whether any given clientId (or user) is throttled. This is not too bad because we can still check the client-side byte-rate metric to validate whether a given client is throttled. But with this io_thread_unit, there will be no way to validate whether a given client is slow because it has exceeded its io_thread_unit limit. It is necessary for users to be able to know this information to figure out whether they have reached their quota limit. How about we add a log4j log on the server side to periodically print the (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that the Kafka administrator can figure out which users have reached their limit and act accordingly?

Thanks,
Dong
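
A sketch of the periodic logging Dong suggests (the snapshot source, interval, and names are hypothetical stand-ins for whatever the broker tracks internally; a real broker would write to log4j rather than stdout):

    import java.util.concurrent.{Executors, TimeUnit}

    object ThrottleLogger {
      case class ThrottleStats(clientId: String,
                               byteRateThrottleMs: Double,
                               ioThreadUnitThrottleMs: Double)

      // Stand-in for a snapshot of per-client throttle metrics.
      def snapshot(): Seq[ThrottleStats] =
        Seq(ThrottleStats("client-1", 120.0, 35.5))

      def main(args: Array[String]): Unit = {
        val scheduler = Executors.newSingleThreadScheduledExecutor()
        scheduler.scheduleAtFixedRate(() => {
          // Log only the clients that were actually throttled.
          snapshot().filter(s => s.byteRateThrottleMs > 0 || s.ioThreadUnitThrottleMs > 0)
            .foreach(s => println(s"throttled: ${s.clientId} " +
              s"byte-rate=${s.byteRateThrottleMs}ms io-thread-units=${s.ioThreadUnitThrottleMs}ms"))
        }, 0, 30, TimeUnit.SECONDS)
        Thread.sleep(1000) // demo: let one logging pass run
        scheduler.shutdown()
      }
    }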

On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:

Stated as "Request processing time throttling will be applied on top if necessary." I thought that it meant the request processing time throttling is applied first, but continuing to read I found it actually meant to apply produce / fetch byte rate throttling first.

Also the last sentence "The remaining delay if any is applied to the response." is a bit confusing to me. Maybe reword it a bit?

Guozhang
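
One plausible reading of "applied on top", assuming the response is held for the larger of the two delays rather than their sum (this is an interpretation, not confirmed by the KIP text quoted above):

    object CombinedThrottle {
      // The byte rate delay is computed first; request time throttling
      // then only adds whatever extra delay is still needed.
      def totalDelayMs(byteRateDelayMs: Long, requestTimeDelayMs: Long): Long =
        math.max(byteRateDelayMs, requestTimeDelayMs)

      def main(args: Array[String]): Unit = {
        println(totalDelayMs(100, 60))  // 100: byte rate delay already covers it
        println(totalDelayMs(100, 150)) // 150: 50ms of "remaining delay" on top
      }
    }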

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017
> at
> > > 2:19
> > > > > PM,
> > > > > > > > > Rajini
> > > > > > > > > > > > > Sivaram
> > > > > > > > > > > > > > <
> > > > > > > > > > > > > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > > > > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > Thank you for the
> > > > feedback.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > 1. I have updated
> the
> > > KIP
> > > > to
> > > > > > use
> > > > > > > > > > > absolute
> > > > > > > > > > > > > > units
> > > > > > > > > > > > > > > >> > >> instead of
> > > > > > > > > > > > > > > >> > >> > >> > > > > percentage.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > The
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > property is called*
> > > > > > > > io_thread_units*
> > > > > > > > > > to
> > > > > > > > > > > > > align
> > > > > > > > > > > > > > > with
> > > > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > > > >> > >> > >> thread
> > > > > > > > > > > > > > > >> > >> > >> > > count
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > property
> > > *num.io.threads*.
> > > > > > When
> > > > > > > we
> > > > > > > > > > > > implement
> > > > > > > > > > > > > > > >> network
> > > > > > > > > > > > > > > >> > >> > thread
> > > > > > > > > > > > > > > >> > >> > >> > > > > utilization
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > quotas, we can add
> > > another
> > > > > > > > property
> > > > > > > > > > > > > > > >> > >> > *network_thread_units.*
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > 2.
> ControlledShutdown
> > is
> > > > > > already
> > > > > > > > > > listed
> > > > > > > > > > > > > under
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > >> > >> exempt
> > > > > > > > > > > > > > > >> > >> > >> > > requests.
> > > > > > > > > > > > > > > >> > >> > >> > > > > Jun,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > did
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > you mean a different
> > > > request
> > > > > > > that
> > > > > > > > > > needs
> > > > > > > > > > > to
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > >> added?
> > > > > > > > > > > > > > > >> > >> The
> > > > > > > > > > > > > > > >> > >> > >> four
> > > > > > > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > currently exempt in
> > the
> > > > KIP
> > > > > > are
> > > > > > > > > > > > StopReplica,
> > > > > > > > > > > > > > > >> > >> > >> > ControlledShutdown,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > LeaderAndIsr and
> > > > > > UpdateMetadata.
> > > > > > > > > These
> > > > > > > > > > > are
> > > > > > > > > > > > > > > >> controlled
> > > > > > > > > > > > > > > >> > >> > using
> > > > > > > > > > > > > > > >> > >> > >> > > > > > ClusterAction
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > ACL, so it is easy
> to
> > > > > exclude
> > > > > > > and
> > > > > > > > > only
> > > > > > > > > > > > > > throttle
> > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > >> > >> > >> > unauthorized.
> > > > > > > > > > > > > > > >> > >> > >> > > I
> > > > > > > > > > > > > > > >> > >> > >> > > > > > wasn't
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > sure if there are
> > other
> > > > > > requests
> > > > > > > > > used
> > > > > > > > > > > only
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > > >> > >> > inter-broker
> > > > > > > > > > > > > > > >> > >> > >> > that
> > > > > > > > > > > > > > > >> > >> > >> > > > > needed
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > be excluded.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling if any plus send response. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.

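As a rough illustration of that wrapper idea, here is a minimal Java sketch (hypothetical names and types throughout, not the actual KafkaApis code): every response funnels through one method that measures the handler time actually spent on the request and delays the send if the client's quota is exceeded.

    import java.util.function.LongUnaryOperator;

    class ThrottledSender {
        // Hypothetical hook: maps "nanoseconds just used by this client" to
        // "milliseconds to delay its response" (zero if within quota).
        private final LongUnaryOperator recordAndGetThrottleMs;

        ThrottledSender(LongUnaryOperator recordAndGetThrottleMs) {
            this.recordAndGetThrottleMs = recordAndGetThrottleMs;
        }

        // Called wherever requestChannel.sendResponse() used to be called.
        // Because it runs after the handler finishes, the time spent inside
        // the handling method is included in the recorded cost.
        void sendResponseMaybeThrottle(long requestDequeueNanos, Runnable send)
                throws InterruptedException {
            long usedNanos = System.nanoTime() - requestDequeueNanos;
            long throttleMs = recordAndGetThrottleMs.applyAsLong(usedNanos);
            if (throttleMs > 0) {
                Thread.sleep(throttleMs); // stand-in for parking the response in a purgatory
            }
            send.run(); // the original sendResponse() call
        }
    }
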
Regards,

Rajini


On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers. While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern of request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better (see the arithmetic sketch after these comments).

2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.

3. Implementation wise, I am wondering if it's simpler to apply the request time throttling first in KafkaApis.handle(). Otherwise, we will need to add the throttling logic in each type of request.

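On comment 1, a tiny runnable sketch of the scaling concern (a sketch only, with assumed numbers; it treats each handler thread as offering 1000 ms of processing time per wall-clock second, so a percentage quota's absolute capacity grows with the pool while a unit quota would not):

    public class QuotaCapacity {
        // Absolute capacity implied by a percentage quota, in handler-thread
        // milliseconds of processing time per second.
        static double capacityMillisPerSec(double quotaPercent, int handlerThreads) {
            return quotaPercent / 100.0 * handlerThreads * 1000.0;
        }

        public static void main(String[] args) {
            System.out.println(capacityMillisPerSec(10, 8));  // 800.0 ms/s with 8 threads
            System.out.println(capacityMillisPerSec(10, 16)); // 1600.0 ms/s after doubling
        }
    }
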
Thanks,

Jun

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler utilization. At the moment, it uses percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP. Also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and not have to worry about the internal distribution of the work between the two thread pools in Kafka.

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is exactly what people have said. I will just expand that a bit. Consider the following case. The producer sends a produce request with a 10MB message but compressed to 100KB with gzip. The decompression of the message on the broker could take 10-15 seconds, during which time, a request handler thread is completely blocked. In this case, neither the byte-in quota nor the request rate quota may be effective in protecting the broker. Consider another case. A consumer group starts with 10 instances and later on switches to 20 instances. The request rate will likely double, but the actual load on the broker may not double since each fetch request only contains half of the partitions. Request rate quota may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much of the server side resources. In this particular KIP, this resource is the capacity of the request handler threads. I agree that it may not be intuitive for the users to determine how to set the right limit. However, this is not completely new and has been done in the container world already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) has the concept of cpu.cfs_quota_us, which specifies the total amount of time in microseconds for which all tasks in a cgroup can run during a one second period. We can potentially model the request handler threads in a similar way. For example, each request handler thread can be 1 request handler unit and the admin can configure a limit on how many units (say 0.01) a client can have.

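As a toy illustration of that unit model (my sketch, not from the KIP; the names are made up and the overage-to-delay conversion is a crude simplification), a client may consume up to quotaUnits thread-seconds of handler time per one-second period, analogous to cpu.cfs_quota_us over cpu.cfs_period_us:

    public class HandlerTimeQuota {
        private final double quotaUnits;                  // e.g. 0.01 == 1% of one handler thread
        private final long periodNanos = 1_000_000_000L;  // fixed one-second window
        private long windowStartNanos = System.nanoTime();
        private long usedNanos = 0;

        public HandlerTimeQuota(double quotaUnits) {
            this.quotaUnits = quotaUnits;
        }

        // Record handler time consumed by one request; return a delay (ms)
        // for the response if the client has exceeded its allowance.
        public synchronized long recordAndGetThrottleMs(long requestNanos) {
            long now = System.nanoTime();
            if (now - windowStartNanos >= periodNanos) {  // roll over to a new period
                windowStartNanos = now;
                usedNanos = 0;
            }
            usedNanos += requestNanos;
            long allowedNanos = (long) (quotaUnits * periodNanos);
            if (usedNanos <= allowedNanos) {
                return 0;
            }
            return (usedNanos - allowedNanos) / 1_000_000; // charge the overage as a delay
        }
    }
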
Regarding not throttling the internal broker to broker requests. We could do that. Alternatively, we could just let the admin configure a high limit for the kafka user (it may not be able to do that easily based on clientId though).

Ideally we want to be able to protect the utilization of the network thread pool too. The difficulty is mostly what Rajini said: (1) The mechanism for throttling the requests is through Purgatory and we will have to think through how to integrate that into the network layer. (2) In the network layer, currently we know the user, but not the clientId of the request. So, it's a bit tricky to throttle based on clientId there. Plus, the byteOut quota can already protect the network thread utilization for fetch requests. So, if we can't figure out this part right now, just focusing on the request handling threads for this KIP is still a useful feature.

Thanks,

Jun

On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Jay: I have removed exemption for consumer heartbeat etc. Agree that protecting the cluster is more important than protecting individual apps. Have retained the exemption for StopReplica/LeaderAndIsr etc, these are throttled only if authorization fails (so can't be used for DoS attacks in a secure cluster, but allows inter-broker requests to complete without delays).

I will wait another day to see if there is any objection to quotas based on request processing time (as opposed to request rate) and if there are no objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request handler threads (that made calculation easy). I think the suggestion is to include the time spent in the network threads as well since that may be significant. As Jay pointed out, it is more complicated to calculate the total available CPU time and convert to a ratio when there are *m* I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give us what we want, but it can be very expensive on some platforms. As Becket and Guozhang have pointed out, we do have several time measurements already for generating metrics that we could use, though we might want to switch to nanoTime() instead of currentTimeMillis() since some of the values for small requests may be < 1ms. But rather than add up the time spent in I/O thread and network thread, wouldn't it be better to convert the time spent on each thread into a separate ratio? UserA has a request quota of 5%. Can we take that to mean that UserA can use 5% of the time on network threads and 5% of the time on I/O threads? If either is exceeded, the response is throttled - it would mean maintaining two sets of metrics for the two durations, but would result in more meaningful ratios. We could define two quota limits (UserA has 5% of request threads and 10% of network threads), but that seems unnecessary and harder to explain to users.

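A minimal sketch of that per-pool ratio idea (hypothetical and deliberately simplified to a single fixed window; a real implementation would use rolling quota windows): network time and I/O time are tracked independently, each against the same fractional quota, and the client is in violation if either ratio is exceeded.

    public class DualRatioQuota {
        private final double quotaFraction;  // e.g. 0.05 for a 5% quota
        private final int networkThreads;
        private final int ioThreads;
        private final long windowNanos = 1_000_000_000L; // one-second window
        private long networkUsedNanos = 0;
        private long ioUsedNanos = 0;

        public DualRatioQuota(double quotaFraction, int networkThreads, int ioThreads) {
            this.quotaFraction = quotaFraction;
            this.networkThreads = networkThreads;
            this.ioThreads = ioThreads;
        }

        // Callers time each stage with System.nanoTime() deltas.
        public synchronized void recordNetworkNanos(long nanos) { networkUsedNanos += nanos; }
        public synchronized void recordIoNanos(long nanos)      { ioUsedNanos += nanos; }

        // Violated if the client used more than quotaFraction of *either*
        // pool's total available time in the window.
        public synchronized boolean violated() {
            double networkRatio = (double) networkUsedNanos / (windowNanos * networkThreads);
            double ioRatio      = (double) ioUsedNanos      / (windowNanos * ioThreads);
            return networkRatio > quotaFraction || ioRatio > quotaFraction;
        }
    }
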
Back to why and how quotas are applied to network thread utilization:
a) In the case of fetch, the time spent in the network thread may be significant and I can see the need to include this. Are there other requests where the network thread utilization is significant? In the case of fetch, request handler thread utilization would throttle clients with high request rate, low data volume and fetch byte rate quota will throttle clients with high data volume. Network thread utilization is perhaps proportional to the data volume. I am wondering if we even need to throttle based on network thread utilization or whether the data volume quota covers this case.

b) At the moment, we record and check for quota violation at the same time. If a quota is violated, the response is delayed. Using Jay's example of disk reads for fetches happening in the network thread, we can't record and delay a response after the disk reads. We could record the time spent on the network thread when the response is complete and introduce a delay for handling a subsequent request (separate out recording and quota violation handling in the case of network thread overload). Does that make sense?

> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > On Tue,
> Feb
> > > 21,
> > > > > 2017
> > > > > > > at
> > > > > > > > > 2:58
> > > > > > > > > > > AM,
> > > > > > > > > > > > > > > Becket
> > > > > > > > > > > > > > > >> > Qin <
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > >
> becket.qin@gmail.com>
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > Hey Jay,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > Yeah, I
> > > agree
> > > > > that
> > > > > > > > > > enforcing
> > > > > > > > > > > > the
> > > > > > > > > > > > > > CPU
> > > > > > > > > > > > > > > >> time
> > > > > > > > > > > > > > > >> > >> is a
> > > > > > > > > > > > > > > >> > >> > >> > little
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > tricky. I
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > am
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > thinking
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > that
> maybe
> > > we
> > > > > can
> > > > > > > use
> > > > > > > > > the
> > > > > > > > > > > > > existing
> > > > > > > > > > > > > > > >> > request
> > > > > > > > > > > > > > > >> > >> > >> > > statistics.
> > > > > > > > > > > > > > > >> > >> > >> > > > > They
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > are
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > already
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > very
> > > detailed
> > > > so
> > > > > > we
> > > > > > > > can
> > > > > > > > > > > > probably
> > > > > > > > > > > > > > see
> > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > >> > >> > >> > approximate
> > > > > > > > > > > > > > > >> > >> > >> > > > CPU
> > > > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > from
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > it,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> something
> > > like
> > > > > > > > > > (total_time -
> > > > > > > > > > > > > > > >> > >> > >> > > > request/response_queue_time
> > > > > > > > > > > > > > > >> > >> > >> > > > > -
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> remote_time).
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > I agree
> > with
> > > > > > > Guozhang
> > > > > > > > > that
> > > > > > > > > > > > when
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > >> user is
> > > > > > > > > > > > > > > >> > >> > >> throttled
> > > > > > > > > > > > > > > >> > >> > >> > > it
> > > > > > > > > > > > > > > >> > >> > >> > > > is
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > likely
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > that
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > we
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > need to
> > see
> > > if
> > > > > > > > anything
> > > > > > > > > > has
> > > > > > > > > > > > went
> > > > > > > > > > > > > > > wrong
> > > > > > > > > > > > > > > >> > >> first,
> > > > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > > > >> > >> > >> > if
> > > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > users
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > are
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > well
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > behaving
> > and
> > > > > just
> > > > > > > need
> > > > > > > > > > more
> > > > > > > > > > > > > > > >> resources, we
> > > > > > > > > > > > > > > >> > >> will
> > > > > > > > > > > > > > > >> > >> > >> have
> > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > >> > >> > >> > > > > bump
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > up
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > quota
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > for
> them.
> > It
> > > > is
> > > > > > true
> > > > > > > > > that
> > > > > > > > > > > > > > > >> pre-allocating
> > > > > > > > > > > > > > > >> > >> CPU
> > > > > > > > > > > > > > > >> > >> > >> time
> > > > > > > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > precisely
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > for
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > users is
> > > > > > difficult.
> > > > > > > So
> > > > > > > > > in
> > > > > > > > > > > > > practice
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > >> > would
> > > > > > > > > > > > > > > >> > >> > >> > probably
> > > > > > > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > > > > > > >> > >> > >> > > > > > more
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > like
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > first
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > set
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > a
> relative
> > > > high
> > > > > > > > > protective
> > > > > > > > > > > CPU
> > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > >> quota
> > > > > > > > > > > > > > > >> > >> for
> > > > > > > > > > > > > > > >> > >> > >> > > everyone
> > > > > > > > > > > > > > > >> > >> > >> > > > > and
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > increase
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > that
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > for some
> > > > > > individual
> > > > > > > > > > clients
> > > > > > > > > > > on
> > > > > > > > > > > > > > > demand.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > Jiangjie
> > > > > (Becket)
> > > > > > > Qin
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > On Mon,
> > Feb
> > > > 20,
> > > > > > 2017
> > > > > > > > at
> > > > > > > > > > 5:48
> > > > > > > > > > > > PM,
> > > > > > > > > > > > > > > >> Guozhang
> > > > > > > > > > > > > > > >> > >> > Wang <
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > >
> wangguoz@gmail.com
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > > This
> is
> > a
> > > > > great
> > > > > > > > > > proposal,
> > > > > > > > > > > > glad
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > >> see
> > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > >> > >> > >> > happening.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > > I am
> > > > inclined
> > > > > to
> > > > > > > the
> > > > > > > > > CPU
> > > > > > > > > > > > > > > >> throttling, or
> > > > > > > > > > > > > > > >> > >> more
> > > > > > > > > > > > > > > >> > >> > >> > > > > specifically
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > processing
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > time
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > > ratio
> > > > instead
> > > > > of
> > > > > > > the
> > > > > > > > > > > request
> > > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > >> > >> throttling
> > > > > > > > > > > > > > > >> > >> > >> as
> > > > > > > > > > > > > > > >> > >> > >> > > well.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > Becket
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > has
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > very
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > well
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > > summed
> > my
> > > > > > > rationales
> > > > > > > > > > > above,
> > > > > > > > > > > > > and
> > > > > > > > > > > > > > > one
> > > > > > > > > > > > > > > >> > >> thing to
> > > > > > > > > > > > > > > >> > >> > >> add
> > > > > > > > > > > > > > > >> > >> > >> > > here
> > > > > > > > > > > > > > > >> > >> > >> > > > > is
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > that
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > former
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > > has a
> > good
> > > > > > support
> > > > > > > > for
> > > > > > > > > > > both
> > > > > > > > > > > > > > > >> "protecting
> > > > > > > > > > > > > > > >> > >> > >> against
> > > > > > > > > > > > > > > >> > >> > >> > > rogue
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > clients"
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > as
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > well
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > "utilizing a
> > > > > > > cluster
> > > > > > > > > for
> > > > > > > > > > > > > > > >> multi-tenancy
> > > > > > > > > > > > > > > >> > >> > usage":
> > > > > > > > > > > > > > > >> > >> > >> > when
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > thinking
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > about
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > how
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> explain
> > > this
> > > > > to
> > > > > > > the
> > > > > > > > > end
> > > > > > > > > > > > > users, I
> > > > > > > > > > > > > > > >> find
> > > > > > > > > > > > > > > >> > it
> > > > > > > > > > > > > > > >> > >> > >> actually
> > > > > > > > > > > > > > > >> > >> > >> > > > more
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > natural
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > than
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> request
> > > rate
> > > > > > since
> > > > > > > > as
> > > > > > > > > > > > > mentioned
> > > > > > > > > > > > > > > >> above,
> > > > > > > > > > > > > > > >> > >> > >> different
> > > > > > > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > will
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > have
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > quite
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > different
> > > > > > "cost",
> > > > > > > > and
> > > > > > > > > > > Kafka
> > > > > > > > > > > > > > today
> > > > > > > > > > > > > > > >> > already
> > > > > > > > > > > > > > > >> > >> > have
> > > > > > > > > > > > > > > >> > >> > >> > > > various
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > request
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > types
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > (produce,
> > > > > fetch,
> > > > > > > > > admin,
> > > > > > > > > > > > > > metadata,
> > > > > > > > > > > > > > > >> etc),
> > > > > > > > > > > > > > > >> > >> > >> because
> > > > > > > > > > > > > > > >> > >> > >> > of
> > > > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > request
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > throttling
> > > > may
> > > > > > not
> > > > > > > > be
> > > > > > > > > as
> > > > > > > > > > > > > > effective
> > > > > > > > > > > > > > > >> > >> unless it
> > > > > > > > > > > > > > > >> > >> > >> is
> > > > > > > > > > > > > > > >> > >> > >> > set
> > > > > > > > > > > > > > > >> > >> > >> > > > > very
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > conservatively.
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > Regarding
> > > to
> > > > > > user
> > > > > > > > > > > reactions
> > > > > > > > > > > > > when
> > > > > > > > > > > > > > > >> they
> > > > > > > > > > > > > > > >> > are
> > > > > > > > > > > > > > > >> > >> > >> > > throttled,
> > > > > > > > > > > > > > > >> > >> > >> > > > I
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > think
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > it
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > may
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > differ
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > case-by-case,
> > > > > > and
> > > > > > > > need
> > > > > > > > > > to
> > > > > > > > > > > be
> > > > > > > > > > > > > > > >> > discovered /
> > > > > > > > > > > > > > > >> > >> > >> guided
> > > > > > > > > > > > > > > >> > >> > >> > by
> > > > > > > > > > > > > > > >> > >> > >> > > > > > looking
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > at
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > relative
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> metrics.
> > > So
> > > > in
> > > > > > > other
> > > > > > > > > > words
> > > > > > > > > > > > > users
> > > > > > > > > > > > > > > >> would
> > > > > > > > > > > > > > > >> > >> not
> > > > > > > > > > > > > > > >> > >> > >> expect
> > > > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > > > >> > >> > >> > > > > get
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > additional
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > information
> > > > by
> > > > > > > > simply
> > > > > > > > > > > being
> > > > > > > > > > > > > told
> > > > > > > > > > > > > > > >> "hey,
> > > > > > > > > > > > > > > >> > >> you
> > > > > > > > > > > > > > > >> > >> > are
> > > > > > > > > > > > > > > >> > >> > >> > > > > > throttled",
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > which
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > all
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > > what
> > > > > throttling
> > > > > > > > does;
> > > > > > > > > > they
> > > > > > > > > > > > > need
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > >> > take a
> > > > > > > > > > > > > > > >> > >> > >> > follow-up
> > > > > > > > > > > > > > > >> > >> > >> > > > > step
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > and
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > see
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > "hmm,
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > throttled
> > > > > > probably
> > > > > > > > > > because
> > > > > > > > > > > > of
> > > > > > > > > > > > > > ..",
> > > > > > > > > > > > > > > >> > which
> > > > > > > > > > > > > > > >> > >> is
> > > > > > > > > > > > > > > >> > >> > by
> > > > > > > > > > > > > > > >> > >> > >> > > > looking
> > > > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > metric
> > > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> values:
> > > e.g.
> > > > > > > whether
> > > > > > > > > I'm
> > > > > > > > > > > > > > > bombarding
> > > > > > > > > > > > > > > >> the
> > > > > > > > > > > > > > > >> > >> > >> brokers
> > > > > > > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > > > ...
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > [Message clipped]
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > *Todd Palino*
> > > > > > Staff Site Reliability Engineer
> > > > > > Data Infrastructure Streaming
> > > > > >
> > > > > >
> > > > > >
> > > > > > linkedin.com/in/toddpalino
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > *Todd Palino*
> > > > Staff Site Reliability Engineer
> > > > Data Infrastructure Streaming
> > > >
> > > >
> > > >
> > > > linkedin.com/in/toddpalino
> > > >
> > >
> >
>
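
To make the (total_time - request/response_queue_time - remote_time)
approximation quoted above concrete, here is a minimal sketch in Java; the
class, method and parameter names are hypothetical placeholders, not
Kafka's actual metric or class names:

    // Sketch only: approximate the CPU time a request consumed by
    // subtracting queueing and remote (purgatory) time from its total time.
    public final class RequestCpuTimeEstimate {
        static long approximateCpuTimeMs(long totalTimeMs,
                                         long requestQueueTimeMs,
                                         long responseQueueTimeMs,
                                         long remoteTimeMs) {
            // total_time - request/response_queue_time - remote_time
            return totalTimeMs - requestQueueTimeMs
                    - responseQueueTimeMs - remoteTimeMs;
        }
    }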

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Hi Jun,

Thank you for reviewing the KIP again.

30. That is a good idea. In fact, it is one of the advantages of measuring
overall utilization rather than separate values for network and I/O threads
as I had intended initially. Have updated the KIP, thanks.

31. Added exempt-request-time metric.

32. I had thought of using quota.window.size.seconds * quota.window.num
initially, but felt that would be too big. Even the default of 11 seconds
is a rather long time to be throttled. With a limit of
quota.window.size.seconds, if the recorded time was very high, subsequent
requests over the full interval covered by the samples will each also be
throttled for up to quota.window.size.seconds. So capping at
quota.window.size.seconds limits the throttle time for an individual
request, avoiding timeouts where possible, while still throttling over a
period of time.
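
As a minimal sketch of that capping (hypothetical names, not the actual
broker code):

    // Sketch only: cap each request's throttle time at one quota window so
    // a single expensive request cannot delay a client long enough to time
    // out; continued overuse still throttles each subsequent request.
    static long cappedThrottleTimeMs(long computedDelayMs,
                                     long quotaWindowSizeMs) {
        return Math.min(computedDelayMs, quotaWindowSizeMs);
    }

Roughly, with the default one second window, a computed delay of 50 seconds
would be applied as one second on this request and again on subsequent
requests while the measured usage stays above the quota.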

33. Updated to use request_percentage.


On Thu, Mar 9, 2017 at 5:40 PM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> Thanks for the updated KIP. A few more comments.
>
> 30. Should we just account for the time in network threads in this KIP too?
> The issue with doing this later is that existing quotas may be too small
> and everyone will have to adjust them before upgrading, which is
> inconvenient. If we just do the delaying in the io threads, there probably
> isn't too much additional work to include the network thread time?
>
> 31. It would be useful for the new metrics to capture the utilization of
> all those requests exempt from request throttling (under sth like
> "exempt"). It's useful for an admin to know how much time is spent there
> too.
>
> 32. "The maximum throttle time for any single request will be the quota
> window size (one second by default)." We probably should cap the delay at
> quota.window.size.seconds * quota.window.num?
>
> 33. It's unfortunate that we use . in configs and _ in ZK data structures.
> However, for consistency, request.percentage in ZK probably should be
> request_percentage?
>
> Thanks,
>
> Jun
>
> On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <ra...@gmail.com> wrote:
>
> > I have updated the KIP to use "request.percentage" quotas where the
> > percentage is out of a total of (num.io.threads * 100). I have added the
> > other options considered so far under "Rejected Alternatives".
> >
> > To address Todd's concern about per-thread quotas: Even though the quotas
> > are out of (num.io.threads * 100), clients are not locked into threads.
> > Utilization is measured as the total across all the I/O threads, so a 10%
> > quota can be 1% of 10 threads. Individual quotas can also be greater than
> > 100% if required.
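
A minimal sketch of that accounting (hypothetical names, not the actual
implementation):

    // Sketch only: utilization is the total time a principal's requests
    // spent on any I/O thread within the quota window, so a quota of 10
    // (out of num.io.threads * 100) can be met as 10% of one thread or as
    // 1% of each of ten threads.
    static double ioThreadUtilizationPercent(long summedIoThreadTimeMs,
                                             long windowMs) {
        return 100.0 * summedIoThreadTimeMs / windowMs; // may exceed 100
    }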
> >
> > Please let me know if there are any other concerns or suggestions.
> >
> > Thank you,
> >
> > Rajini
> >
> > On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tp...@gmail.com> wrote:
> >
> > > Rajini -
> > >
> > > I understand what you’re saying, but the point I’m making is that I
> > > don’t believe we need to take it into account directly. The CPU
> > > utilization of the network threads is directly proportional to the
> > > number of bytes being sent. The more bytes, the more CPU that is
> > > required for SSL (or other tasks). This is opposed to the request
> > > handler threads, where there are a number of factors that affect CPU
> > > utilization. This means that it’s not necessary to separately quota
> > > network thread byte usage and CPU - if we quota byte usage (which we
> > > already do), we have fixed the CPU usage at a proportional amount.
> > >
> > > Jun -
> > >
> > > Thanks for the clarification there. I was thinking of the utilization
> > > percentage as being fixed, not what the percentage reflects. I’m not
> > > tied to either way of doing it, provided that we do not lock clients to
> > > a single thread. For example, if I specify that a given client can use
> > > 10% of a single thread, that should also mean they can use 1% on 10
> > > threads.
> > >
> > > -Todd
> > >
> > >
> > >
> > > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Todd,
> > > >
> > > > Thanks for the feedback.
> > > >
> > > > I just want to clarify your second point. If the limit percentage is
> > > > per thread and the thread counts are changed, the absolute processing
> > > > limit for existing users hasn't changed and there is no need to
> > > > adjust them. On the other hand, if the limit percentage is of total
> > > > thread pool capacity and the thread counts are changed, the effective
> > > > processing limit for a user will change. So, to preserve the current
> > > > processing limit, existing user limits have to be adjusted. If there
> > > > is a hardware change, the effective processing limit for a user will
> > > > change in either approach and the existing limit may need to be
> > > > adjusted. However, hardware changes are less common than thread pool
> > > > configuration changes.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tp...@gmail.com> wrote:
> > > >
> > > > > I’ve been following this one on and off, and overall it sounds
> > > > > good to me.
> > > > >
> > > > > - The SSL question is a good one. However, that type of overhead
> > > > > should be proportional to the bytes rate, so I think that a bytes
> > > > > rate quota would still be a suitable way to address it.
> > > > >
> > > > > - I think it’s better to make the quota percentage of total thread
> > > > > pool capacity, and not percentage of an individual thread. That way
> > > > > you don’t have to adjust it when you adjust thread counts (tuning,
> > > > > hardware changes, etc.)
> > > > >
> > > > >
> > > > > -Todd
> > > > >
> > > > >
> > > > >
> > > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <be...@gmail.com> wrote:
> > > > >
> > > > > > I see. Good point about SSL.
> > > > > >
> > > > > > I just asked Todd to take a look.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > > >
> > > > > > > Hi, Jiangjie,
> > > > > > >
> > > > > > > Yes, I agree that byte rate already protects the network
> > > > > > > threads indirectly. I am not sure if byte rate fully captures
> > > > > > > the CPU overhead in network due to SSL. So, at the high level,
> > > > > > > we can use request time limit to protect CPU and use byte rate
> > > > > > > to protect storage and network.
> > > > > > >
> > > > > > > Also, do you think you can get Todd to comment on this KIP?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <becket.qin@gmail.com> wrote:
> > > > > > >
> > > > > > > > Hi Rajini/Jun,
> > > > > > > >
> > > > > > > > The percentage based reasoning sounds good.
> > > > > > > > One thing I am wondering is that if we assume the network
> > > > > > > > threads are just doing the network IO, can we say the bytes
> > > > > > > > rate quota is already sort of a network threads quota?
> > > > > > > > If we take network threads into consideration here, would
> > > > > > > > that be somewhat overlapping with the bytes rate quota?
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Jun,
> > > > > > > > >
> > > > > > > > > Thank you for the explanation, I hadn't realized you meant
> > > > > > > > > percentage of the total thread pool. If everyone is OK with
> > > > > > > > > Jun's suggestion, I will update the KIP.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Rajini
> > > > > > > > >
> > > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Rajini,
> > > > > > > > > >
> > > > > > > > > > Let's take your example. Let's say a user sets the limit
> > > > > > > > > > to 50%. I am not sure if it's better to apply the same
> > > > > > > > > > percentage separately to network and io thread pool. For
> > > > > > > > > > example, for produce requests, most of the time will be
> > > > > > > > > > spent in the io threads whereas for fetch requests, most
> > > > > > > > > > of the time will be in the network threads. So, using the
> > > > > > > > > > same percentage in both thread pools means one of the
> > > > > > > > > > pools' resources will be over-allocated.
> > > > > > > > > >
> > > > > > > > > > An alternative way is to simply model network and io
> > > > > > > > > > thread pool together. If you get 10 io threads and 5
> > > > > > > > > > network threads, you get 1500% request processing power.
> > > > > > > > > > A 50% limit means a total of 750% processing power. We
> > > > > > > > > > just add up the time a user request spent in either
> > > > > > > > > > network or io thread. If that total exceeds 750% (doesn't
> > > > > > > > > > matter whether it's spent more in network or io thread),
> > > > > > > > > > the request will be throttled. This seems more general
> > > > > > > > > > and is not sensitive to the current implementation detail
> > > > > > > > > > of having a separate network and io thread pool. In the
> > > > > > > > > > future, if the threading model changes, the same concept
> > > > > > > > > > of quota can still be applied. For now, since it's a bit
> > > > > > > > > > tricky to add the delay logic in the network thread pool,
> > > > > > > > > > we could probably just do the delaying only in the io
> > > > > > > > > > threads as you suggested earlier.
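
A rough sketch of that combined accounting, using the same thread counts
(hypothetical names, not the actual implementation):

    // Sketch only: model the network and io thread pools together, with
    // (numIoThreads + numNetworkThreads) * 100% total capacity, e.g.
    // 10 io + 5 network threads = 1500%, so a "50%" share is a 750% quota.
    static boolean overQuota(long ioTimeMs, long networkTimeMs, long windowMs,
                             int numIoThreads, int numNetworkThreads,
                             double shareOfCapacity) {
        double capacityPercent = (numIoThreads + numNetworkThreads) * 100.0;
        double quotaPercent = shareOfCapacity * capacityPercent; // 0.5 -> 750%
        double usedPercent = 100.0 * (ioTimeMs + networkTimeMs) / windowMs;
        return usedPercent > quotaPercent;
    }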
> > > > > > > > > >
> > > > > > > > > > There is still the orthogonal question of whether a
> > > > > > > > > > quota of 50% is out of 100% or 100% * #total processing
> > > > > > > > > > threads. My feeling is that the latter is slightly better
> > > > > > > > > > based on my explanation earlier. The way to describe this
> > > > > > > > > > quota to the users can be "share of elapsed request
> > > > > > > > > > processing time on a single CPU" (similar to top).
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Jun,
> > > > > > > > > > >
> > > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > > >
> > > > > > > > > > > But still not sure about a single quota covering both
> > > > > > > > > > > network threads and I/O threads with per-thread quota.
> > > > > > > > > > > If there are 10 I/O threads and 5 network threads and I
> > > > > > > > > > > want to assign half the quota to userA, the quota would
> > > > > > > > > > > be 750%. I imagine, internally, we would convert this
> > > > > > > > > > > to 500% for I/O and 250% for network threads to
> > > > > > > > > > > allocate 50% of each pool.
> > > > > > > > > > >
> > > > > > > > > > > A couple of scenarios:
> > > > > > > > > > >
> > > > > > > > > > > 1. Admin adds 1 extra network thread. To retain 50%,
> > > > > > > > > > > admin needs to now allocate 800% for each user. Or
> > > > > > > > > > > increase the quota for a few users. To me, it feels
> > > > > > > > > > > like admin needs to convert 50% to 800% and Kafka
> > > > > > > > > > > internally needs to convert 800% to (500%, 300%).
> > > > > > > > > > > Everyone using just 50% feels a lot simpler.
> > > > > > > > > > >
> > > > > > > > > > > 2. We decide to add some other thread to this list.
> > > > > > > > > > > Admin needs to know exactly how many threads form the
> > > > > > > > > > > maximum quota. And we can be changing this between
> > > > > > > > > > > broker versions as we add more to the list. Again a
> > > > > > > > > > > single overall percent would be a lot simpler.
> > > > > > > > > > >
> > > > > > > > > > > There were others who were unconvinced by a single
> > > > > > > > > > > percent from the initial proposal and were happier with
> > > > > > > > > > > thread units similar to CPU units, so I am ok with
> > > > > > > > > > > going with per-thread quotas (as units or percent).
> > > > > > > > > > > Just not sure it makes it easier for admin in all
> > > > > > > > > > > cases.
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > >
> > > > > > > > > > > Rajini
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > >
> > > > > > > > > > > > Consider modeling as n * 100% unit. For 2), the
> > > > > > > > > > > > question is what's causing the I/O threads to be
> > > > > > > > > > > > saturated. It's unlikely that all users' utilization
> > > > > > > > > > > > has increased at the same time. A more likely case is
> > > > > > > > > > > > that a few isolated users' utilization has increased.
> > > > > > > > > > > > If so, after increasing the number of threads, the
> > > > > > > > > > > > admin just needs to adjust the quota for a few
> > > > > > > > > > > > isolated users, which is expected and is less work.
> > > > > > > > > > > >
> > > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all
> > > > > > > > > > > > users' quotas need to be adjusted, which is
> > > > > > > > > > > > unexpected and is more work.
> > > > > > > > > > > >
> > > > > > > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > > > > > > >
> > > > > > > > > > > > As for future extension to cover network thread
> > > > > > > > > > > > utilization, I was thinking that one way is to simply
> > > > > > > > > > > > model the capacity as (n + m) * 100% unit, where n
> > > > > > > > > > > > and m are the number of network and i/o threads,
> > > > > > > > > > > > respectively. Then, for each user, we can just add up
> > > > > > > > > > > > the utilization in the network and the i/o thread. If
> > > > > > > > > > > > we do this, we don't need a new type of quota.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Jun,
> > > > > > > > > > > > >
> > > > > > > > > > > > > If we use request.percentage as the percentage
> > > > > > > > > > > > > used in a single I/O thread, the total percentage
> > > > > > > > > > > > > being allocated will be num.io.threads * 100 for
> > > > > > > > > > > > > I/O threads and num.network.threads * 100 for
> > > > > > > > > > > > > network threads. A single quota covering the two as
> > > > > > > > > > > > > a percentage wouldn't quite work if you want to
> > > > > > > > > > > > > allocate the same proportion in both cases. If we
> > > > > > > > > > > > > want to treat threads as separate units, won't we
> > > > > > > > > > > > > need two quota configurations regardless of whether
> > > > > > > > > > > > > we use units or percentage? Perhaps I misunderstood
> > > > > > > > > > > > > your suggestion.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > > >
> > > > > > > > > > > > >    1. The use case that you mentioned where an
> > > > > > > > > > > > >    admin is adding more users and decides to add
> > > > > > > > > > > > >    more I/O threads and expects to find free quota
> > > > > > > > > > > > >    to allocate for new users.
> > > > > > > > > > > > >    2. Admin adds more I/O threads because the I/O
> > > > > > > > > > > > >    threads are saturated and there are cores
> > > > > > > > > > > > >    available to allocate, even though the number of
> > > > > > > > > > > > >    users/clients hasn't changed.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If we treated I/O threads as a single unit of 100%,
> > > > > > > > > > > > > all user quotas need to be reallocated for 1). If
> > > > > > > > > > > > > we allocated I/O threads as n units with n*100%,
> > > > > > > > > > > > > all user quotas need to be reallocated for 2),
> > > > > > > > > > > > > otherwise some of the new threads may just not be
> > > > > > > > > > > > > used. Either way it should be easy to write a
> > > > > > > > > > > > > script to decrease/increase quotas by a multiple
> > > > > > > > > > > > > for all users.
> > > > > > > > > > > > >
> > > > > > > > > > > > > So it really boils down to which quota unit is most
> > > > > > > > > > > > > intuitive in terms of configuration. And from the
> > > > > > > > > > > > > discussion so far, it feels like opinion is divided
> > > > > > > > > > > > > on whether quotas should be carved out of an
> > > > > > > > > > > > > absolute 100% (or 1 unit) or be relative to the
> > > > > > > > > > > > > number of threads (n*100% or n units).
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Another way to express an absolute limit is to
> > > > > > > > > > > > > > use request.percentage, but treat it as the
> > > > > > > > > > > > > > percentage used in a single request handling
> > > > > > > > > > > > > > thread. For now, the request handling threads can
> > > > > > > > > > > > > > be just the io threads. In the future, they can
> > > > > > > > > > > > > > cover the network threads as well. This is
> > > > > > > > > > > > > > similar to how top reports CPU usage and may be a
> > > > > > > > > > > > > > bit easier for people to understand.
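
As a rough illustration of those semantics (hypothetical numbers, not from
the KIP):

    // "Percentage of a single request handling thread", as top reports CPU:
    //   8 io threads            => 8 * 100% = 800% total broker capacity
    //   request.percentage = 50 => half of one thread's time per window,
    //                              however it is spread across threads
    //                              (50% on one thread, or 6.25% on each of 8)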
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi, Jay,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. Regarding request.unit vs request.percentage.
> > > > > > > > > > > > > > > I started with request.percentage too. The
> > > > > > > > > > > > > > > reasoning for request.unit is the following.
> > > > > > > > > > > > > > > Suppose that the capacity has been reached on a
> > > > > > > > > > > > > > > broker and the admin needs to add a new user. A
> > > > > > > > > > > > > > > simple way to increase the capacity is to
> > > > > > > > > > > > > > > increase the number of io threads, assuming
> > > > > > > > > > > > > > > there are still enough cores. If the limit is
> > > > > > > > > > > > > > > based on percentage, the additional capacity
> > > > > > > > > > > > > > > automatically gets distributed to existing
> > > > > > > > > > > > > > > users and we haven't really carved out any
> > > > > > > > > > > > > > > additional resource for the new user. Now, is
> > > > > > > > > > > > > > > it easy for a user to reason about 0.1 unit vs
> > > > > > > > > > > > > > > 10%? My feeling is that both are hard and have
> > > > > > > > > > > > > > > to be configured empirically. Not sure if
> > > > > > > > > > > > > > > percentage is obviously easier to reason about.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >> A couple of quick points:
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> 1. Even though the implementation of this
> > > > > > > > > > > > > > >> quota is only using io thread time, i think we
> > > > > > > > > > > > > > >> should call it something like "request-time".
> > > > > > > > > > > > > > >> This will give us flexibility to improve the
> > > > > > > > > > > > > > >> implementation to cover network threads in the
> > > > > > > > > > > > > > >> future and will avoid exposing internal
> > > > > > > > > > > > > > >> details like our thread pools on the server.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> 2. Jun/Roger, I get what you are trying to fix
> > > > > > > > > > > > > > >> but the idea of thread/units is super
> > > > > > > > > > > > > > >> unintuitive as a user-facing knob. I had to
> > > > > > > > > > > > > > >> read the KIP like eight times to understand
> > > > > > > > > > > > > > >> this. I'm not sure about your point that
> > > > > > > > > > > > > > >> increasing the number of threads is a problem
> > > > > > > > > > > > > > >> with a percentage-based value; it really
> > > > > > > > > > > > > > >> depends on whether the user thinks about the
> > > > > > > > > > > > > > >> "percentage of request processing time" or
> > > > > > > > > > > > > > >> "thread units". If they think "I have
> > > > > > > > > > > > > > >> allocated 10% of my request processing time to
> > > > > > > > > > > > > > >> user x" then it is a bug that increasing the
> > > > > > > > > > > > > > >> thread count decreases that percent as it does
> > > > > > > > > > > > > > >> in the current proposal. As a practical matter
> > > > > > > > > > > > > > >> I think the only way to actually reason about
> > > > > > > > > > > > > > >> this is as a percent---I just don't believe
> > > > > > > > > > > > > > >> people are going to think, "ah, 4.3 thread
> > > > > > > > > > > > > > >> units, that is the right amount!". Instead I
> > > > > > > > > > > > > > >> think they have to understand this thread unit
> > > > > > > > > > > > > > >> concept, figure out what they have set in
> > > > > > > > > > > > > > >> number of threads, compute a percent and then
> > > > > > > > > > > > > > >> come up with the number of thread units, and
> > > > > > > > > > > > > > >> these will all be wrong if that thread count
> > > > > > > > > > > > > > >> changes. I also think this ties us to
> > > > > > > > > > > > > > >> throttling the I/O thread pool, which may not
> > > > > > > > > > > > > > >> be where we want to end up.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> 3. For what it's worth I do think having a
> > > > > > > > > > > > > > >> single throttle_ms field in all the responses
> > > > > > > > > > > > > > >> that combines all throttling from all quotas
> > > > > > > > > > > > > > >> is probably the simplest. There could be a use
> > > > > > > > > > > > > > >> case for having separate fields for each, but
> > > > > > > > > > > > > > >> I think that is actually harder to use/monitor
> > > > > > > > > > > > > > >> in the common case so unless someone has a use
> > > > > > > > > > > > > > >> case I think just one should be fine.
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> -Jay
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > >>
> > > > > > > > > > > > > > >> > I have updated the KIP based on the discussions so far.
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > Regards,
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > Rajini
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > >> >
> > > > > > > > > > > > > > >> > > Thank you all for the feedback.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Ismael #1. It makes sense not to throttle
> > > > > > > > > > > > > > >> > > inter-broker requests like LeaderAndIsr
> > > > > > > > > > > > > > >> > > etc. The simplest way to ensure that
> > > > > > > > > > > > > > >> > > clients cannot use these requests to
> > > > > > > > > > > > > > >> > > bypass quotas for DoS attacks is to ensure
> > > > > > > > > > > > > > >> > > that ACLs prevent clients from using these
> > > > > > > > > > > > > > >> > > requests and unauthorized requests are
> > > > > > > > > > > > > > >> > > included towards quotas.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Ismael #2, Jay #1 : I was thinking that
> > > > > > > > > > > > > > >> > > these quotas can return a separate
> > > > > > > > > > > > > > >> > > throttle time, and all utilization based
> > > > > > > > > > > > > > >> > > quotas could use the same field (we won't
> > > > > > > > > > > > > > >> > > add another one for network thread
> > > > > > > > > > > > > > >> > > utilization for instance). But perhaps it
> > > > > > > > > > > > > > >> > > makes sense to keep byte rate quotas
> > > > > > > > > > > > > > >> > > separate in produce/fetch responses to
> > > > > > > > > > > > > > >> > > provide separate metrics? Agree with
> > > > > > > > > > > > > > >> > > Ismael that the name of the existing field
> > > > > > > > > > > > > > >> > > should be changed if we have two. Happy to
> > > > > > > > > > > > > > >> > > switch to a single combined throttle time
> > > > > > > > > > > > > > >> > > if that is sufficient.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will
> > > > > > > > > > > > > > >> > > use dot separated name for new property.
> > > > > > > > > > > > > > >> > > Replication quotas use dot separated, so
> > > > > > > > > > > > > > >> > > it will be consistent with all properties
> > > > > > > > > > > > > > >> > > except byte rate quotas.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Radai: #1 Request processing time rather
> > > > > > > > > > > > > > >> > > than request rate was chosen because the
> > > > > > > > > > > > > > >> > > time per request can vary significantly
> > > > > > > > > > > > > > >> > > between requests, as mentioned in the
> > > > > > > > > > > > > > >> > > discussion and KIP.
> > > > > > > > > > > > > > >> > > #2 Two separate quotas for
> > > > > > > > > > > > > > >> > > heartbeats/regular requests feel like more
> > > > > > > > > > > > > > >> > > configuration and more metrics. Since most
> > > > > > > > > > > > > > >> > > users would set quotas higher than the
> > > > > > > > > > > > > > >> > > expected usage and quotas are more of a
> > > > > > > > > > > > > > >> > > safety net, a single quota should work in
> > > > > > > > > > > > > > >> > > most cases.
> > > > > > > > > > > > > > >> > > #3 The number of requests in purgatory is
> > > > > > > > > > > > > > >> > > limited by the number of active
> > > > > > > > > > > > > > >> > > connections since only one request per
> > > > > > > > > > > > > > >> > > connection will be throttled at a time.
> > > > > > > > > > > > > > >> > > #4 As with byte rate quotas, to use the
> > > > > > > > > > > > > > >> > > full allocated quotas, clients/users would
> > > > > > > > > > > > > > >> > > need to use partitions that are
> > > > > > > > > > > > > > >> > > distributed across the cluster. The
> > > > > > > > > > > > > > >> > > alternative of using cluster-wide quotas
> > > > > > > > > > > > > > >> > > instead of per-broker quotas would be far
> > > > > > > > > > > > > > >> > > too complex to implement.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Dong : We currently have two
> > > > > > > > > > > > > > >> > > ClientQuotaManagers for quota types Fetch
> > > > > > > > > > > > > > >> > > and Produce. A new one will be added for
> > > > > > > > > > > > > > >> > > IOThread, which manages quotas for I/O
> > > > > > > > > > > > > > >> > > thread utilization. This will not update
> > > > > > > > > > > > > > >> > > the Fetch or Produce queue-size, but will
> > > > > > > > > > > > > > >> > > have a separate metric for the queue-size.
> > > > > > > > > > > > > > >> > > I wasn't planning to add any additional
> > > > > > > > > > > > > > >> > > metrics apart from the equivalent ones for
> > > > > > > > > > > > > > >> > > existing quotas as part of this KIP. Ratio
> > > > > > > > > > > > > > >> > > of byte-rate to I/O thread utilization
> > > > > > > > > > > > > > >> > > could be slightly misleading since it
> > > > > > > > > > > > > > >> > > depends on the sequence of requests. But
> > > > > > > > > > > > > > >> > > we can look into more metrics after the
> > > > > > > > > > > > > > >> > > KIP is implemented if required.
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > I think we need to limit the maximum delay
> > > > > > > > > > > > > > >> > > since all requests are throttled. If a
> > > > > > > > > > > > > > >> > > client has a quota of 0.001 units and a
> > > > > > > > > > > > > > >> > > single request used 50ms, we don't want to
> > > > > > > > > > > > > > >> > > delay all requests from the client by 50
> > > > > > > > > > > > > > >> > > seconds, throwing the client out of all
> > > > > > > > > > > > > > >> > > its consumer groups. The issue is only if
> > > > > > > > > > > > > > >> > > a user is allocated a quota that is
> > > > > > > > > > > > > > >> > > insufficient to process one large request.
> > > > > > > > > > > > > > >> > > The expectation is that the units
> > > > > > > > > > > > > > >> > > allocated per user will be much higher
> > > > > > > > > > > > > > >> > > than the time taken to process one request
> > > > > > > > > > > > > > >> > > and the limit should seldom be applied.
> > > > > > > > > > > > > > >> > > Agree this needs proper documentation.
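
As a quick check of the arithmetic in that example (numbers from the text
above): with a quota of 0.001 units, a single request that consumed 50ms of
thread time would, uncapped, imply a delay of

    50 ms / 0.001 = 50,000 ms = 50 seconds

which is why the throttle time is bounded rather than applied in full.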
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Regards,
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > Rajini
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:
> > > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > > >> > >> @jun: i wasnt concerned about tying up a
> > > > > > > > > > > > > > >> > >> request processing thread, but IIUC the
> > > > > > > > > > > > > > >> > >> code does still read the entire request
> > > > > > > > > > > > > > >> > >> out, which might add-up to a
> > > > > > > > > > > > > > >> > >> non-negligible amount of memory.
> > > > > > > > > > > > > > >> > >>
> > > > > > > > > > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:
> > > > > > > > > > > > > > >> > >>
> > > > > > > > > > > > > > >> > >> > Hey Rajini,
> > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > >> > >> > The current KIP says that the maximum
> > delay
> > > > > will
> > > > > > be
> > > > > > > > > > reduced
> > > > > > > > > > > > to
> > > > > > > > > > > > > > >> window
> > > > > > > > > > > > > > >> > >> size
> > > > > > > > > > > > > > >> > >> > if it is larger than the window size. I
> > > have
> > > > a
> > > > > > > > concern
> > > > > > > > > > with
> > > > > > > > > > > > > this:
> > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > >> > >> > 1) This essentially means that the user
> > is
> > > > > > allowed
> > > > > > > to
> > > > > > > > > > > exceed
> > > > > > > > > > > > > > their
> > > > > > > > > > > > > > >> > quota
> > > > > > > > > > > > > > >> > >> > over a long period of time. Can you
> > provide
> > > > an
> > > > > > > upper
> > > > > > > > > > bound
> > > > > > > > > > > on
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > >> > >> > deviation?
> > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > >> > >> > 2) What is the motivation for cap the
> > > maximum
> > > > > > delay
> > > > > > > > by
> > > > > > > > > > the
> > > > > > > > > > > > > window
> > > > > > > > > > > > > > >> > size?
> > > > > > > > > > > > > > >> > >> I
> > > > > > > > > > > > > > >> > >> > am wondering if there is better
> > alternative
> > > > to
> > > > > > > > address
> > > > > > > > > > the
> > > > > > > > > > > > > > problem.
> > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > >> > >> > 3) It means that the existing
> > > metric-related
> > > > > > config
> > > > > > > > > will
> > > > > > > > > > > > have a
> > > > > > > > > > > > > > >> more
> > > > > > > > > > > > > > >> > >> > directly impact on the mechanism of
> this
> > > > > > > > > > > io-thread-unit-based
> > > > > > > > > > > > > > >> quota.
> > > > > > > > > > > > > > >> > The
> > > > > > > > > > > > > > >> > >> > may be an important change depending on
> > the
> > > > > > answer
> > > > > > > to
> > > > > > > > > 1)
> > > > > > > > > > > > above.
> > > > > > > > > > > > > > We
> > > > > > > > > > > > > > >> > >> probably
> > > > > > > > > > > > > > >> > >> > need to document this more explicitly.
> > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > >> > >> > Dong
> > > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > > >> > >> >

On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't, because at LinkedIn it would be too much pressure on inGraph to expose those per-clientId metrics, so we ended up printing them periodically to a local log. Never mind if it is not a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field per quota to ProduceResponse or FetchResponse. Is there any use-case for having separate throttle-time fields for the byte-rate quota and the io-thread-unit quota? You probably need to document this as an interface change if you plan to add a new field in any request.

- I don't think IOThread belongs in quotaType. The existing quota types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the type of request that is throttled, not the quota mechanism that is applied.

- If a request is throttled due to this io-thread-unit-based quota, is the existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the io-thread-unit-based quota, and for users to understand its impact on their traffic, would it be useful to have a metric that shows the overall byte-rate per io-thread-unit? Can we also show this as a per-clientId metric?

Thanks,
Dong

On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU cores, but it's possible for an admin to start with fewer io threads than cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of returning an error immediately is that the latter has no way to prevent the client from retrying immediately, which will make things worse. The delaying logic is based off a delay queue. A separate expiration thread just waits on the next request to be expired. So, it doesn't tie up a request handler thread.
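
For illustration, the pattern is roughly the following (sketch only, not the broker code):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // Illustrative only -- not the broker code. A throttled response is
    // parked until its delay expires; a single expiration thread blocks on
    // take(), which only returns the next expired entry, so no request
    // handler thread is held while responses wait.
    class ThrottledResponse(val send: () => Unit, delayMs: Long) extends Delayed {
      private val dueNs = System.nanoTime + TimeUnit.MILLISECONDS.toNanos(delayMs)
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(dueNs - System.nanoTime, TimeUnit.NANOSECONDS)
      override def compareTo(other: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS),
                               other.getDelay(TimeUnit.NANOSECONDS))
    }

    val throttled = new DelayQueue[ThrottledResponse]()
    val expirationThread = new Thread(new Runnable {
      override def run(): Unit = while (true) throttled.take().send()
    })
    expirationThread.setDaemon(true)
    expirationThread.start()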

Thanks,

Jun

On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle time field in the response. The downside is that the client metrics will be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and `log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should have the total time your request was throttled, irrespective of the quotas that caused it? Limiting it to the byte rate quota doesn't make sense, but I also don't think we want to end up adding new fields in the response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about io threads. Once we introduce these quotas people set them and expect them to be enforced (and if they aren't it may cause an outage). As a result they are a bit more sensitive than normal configs, I think. The current thread pools seem like something of an implementation detail and not the level the user-facing quotas should be involved with. I think it might be better to make this a general request-time throttle with no mention of I/O threads in the naming, and simply acknowledge the current limitation (which we may someday fix) in the docs: that this covers only the time after the request is read off the network.
   3. As such I think the right interface to the user would be something like percent_request_time in {0,...,100} or request_time_ratio in {0.0,...,1.0} (I think "ratio" is the terminology we used in the other metrics if the scale is between 0 and 1, right?)

-Jay

On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are going to be very similar to the existing metrics and sensors. To avoid confusion, I have now added more detail. All metrics are in the group "quotaType" and all sensors have names starting with "quotaType" (where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*). So there will be no reuse of existing metrics/sensors. The new ones for request processing time based throttling will be completely independent of existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be impacted by this KIP. That will continue to return byte-rate based throttling times. In addition, a new field request_throttle_time_ms will be added to return request quota based throttling times. These will be exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I believe there are already sufficient metrics to monitor throttling on both the client and broker side for each type of throttling.
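
For illustration, registering one such sensor could look roughly like this (the names follow the convention above but are my guesses, not the final code):

    import java.util.Collections
    import org.apache.kafka.common.metrics.Metrics
    import org.apache.kafka.common.metrics.stats.{Avg, Rate}

    // Sketch only -- names follow the convention described above but are
    // guesses, not the final code.
    val metrics = new Metrics()
    val tags = Collections.singletonMap("user", "alice")

    val sensor = metrics.sensor("IOThread-alice")
    sensor.add(metrics.metricName("throttle-time", "IOThread",
      "Average throttle time in ms due to request time quotas", tags), new Avg())
    sensor.add(metrics.metricName("request-time", "IOThread",
      "Request processing time as a fraction of a thread's time", tags), new Rate())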

Regards,

Rajini

On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to quota users' traffic here. LGTM overall. I have some questions regarding sensors.

- Can you be more specific in the KIP about what sensors will be added? For example, it will be useful to specify the name and attributes of these new sensors.

- We currently have throttle-time and queue-size for the byte-rate based quota. Are you going to have separate throttle-time and queue-size for requests throttled by the io_thread_unit-based quota, or will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain time due to the io_thread_unit-based quota?

- Currently the kafka server doesn't provide any log or metrics that tell whether any given clientId (or user) is throttled. This is not too bad because we can still check the client-side byte-rate metric to validate whether a given client is throttled. But with this io_thread_unit, there will be no way to validate whether a given client is slow because it has exceeded its io_thread_unit limit. It is necessary for users to be able to know this information to figure out whether they have reached their quota limit. How about we add a log4j log on the server side to periodically print the (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that the kafka administrator can identify those users that have reached their limit and act accordingly?
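
Something like the periodic log could be as simple as this sketch (hypothetical names; currentThrottleTimes is a stand-in for whatever the quota manager would expose):

    import java.util.concurrent.{Executors, TimeUnit}
    import org.slf4j.LoggerFactory

    // Sketch only. currentThrottleTimes() stands in for whatever the quota
    // manager would expose: clientId -> (byte-rate throttle ms,
    // io-thread-unit throttle ms) accumulated since the last report.
    val log = LoggerFactory.getLogger("kafka.server.ClientQuotaManager")
    def currentThrottleTimes(): Map[String, (Long, Long)] = Map.empty

    val scheduler = Executors.newSingleThreadScheduledExecutor()
    scheduler.scheduleAtFixedRate(new Runnable {
      override def run(): Unit =
        currentThrottleTimes().foreach { case (clientId, (byteMs, reqMs)) =>
          if (byteMs > 0 || reqMs > 0)
            log.info(s"clientId=$clientId byte-rate-throttle-time=${byteMs}ms " +
              s"io-thread-unit-throttle-time=${reqMs}ms")
        }
    }, 1, 1, TimeUnit.MINUTES)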

Thanks,
Dong

On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:

Stated as "Request processing time throttling will be applied on top if necessary." I thought that it meant the request processing time throttling is applied first, but continuing to read I found it actually meant to apply produce / fetch byte rate throttling first.

Also the last sentence "The remaining delay if any is applied to the response." is a bit confusing to me. Maybe reword it a bit?
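
For what it's worth, one plausible reading of that ordering is sketched below (this is my interpretation, not confirmed KIP behaviour):

    // My interpretation only -- not confirmed by the KIP text. Byte rate
    // throttling is computed first; request time throttling is applied "on
    // top" only for the delay it needs beyond that, i.e. the total is the
    // larger of the two delays rather than their sum.
    def totalDelayMs(byteRateDelayMs: Long, requestTimeDelayMs: Long): Long =
      byteRateDelayMs + math.max(0L, requestTimeDelayMs - byteRateDelayMs)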

Guozhang

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The property is called *io_thread_units* to align with the thread count property *num.io.threads*. When we implement network thread utilization quotas, we can add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun, did you mean a different request that needs to be added? The four requests currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is easy to exclude them and only throttle if unauthorized. I wasn't sure if there are other requests used only for inter-broker communication that needed to be excluded.

3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling if any, plus sends the response. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.
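
As a sketch, the local method could look roughly like this (simplified; sendNow, recordAndGetThrottleMs and park are stand-ins for the real RequestChannel, quota manager and delay queue plumbing):

    // Simplified sketch, not the actual KafkaApis change. `sendNow` stands
    // in for requestChannel.sendResponse(...), `recordAndGetThrottleMs` for
    // the quota manager call, and `park` for adding the response to the
    // delay queue used for throttling.
    def sendResponseMaybeThrottle(sendNow: () => Unit,
                                  recordAndGetThrottleMs: () => Long,
                                  park: (Long, () => Unit) => Unit): Unit = {
      val delayMs = recordAndGetThrottleMs()
      if (delayMs > 0) park(delayMs, sendNow) else sendNow()
    }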

Regards,

Rajini

On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers.

While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.

3. Implementation wise, I am wondering if it's simpler to apply the request time throttling first in KafkaApis.handle(). Otherwise, we will need to add the throttling logic in each type of request.
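
To put rough numbers on the concern in 1 (mine, purely illustrative):

    // Illustrative numbers only (mine, not from the KIP): a 10% quota on a
    // broker with 8 request handler threads, before and after the admin
    // doubles num.io.threads.
    val quotaPercent   = 10.0
    val capacityBefore = 8 * quotaPercent / 100.0   // 0.8 request-thread-units
    val capacityAfter  = 16 * quotaPercent / 100.0  // 1.6 units: the absolute
                                                    // share silently doubled
    // An absolute quota of 0.8 units would be unaffected by the resize.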

Thanks,

Jun

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler utilization. At the moment, it uses percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP. Also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and not have to worry about the internal distribution of the work between the two thread pools in Kafka.

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is exactly what people have said. I will just expand on that a bit. Consider the following case. The producer sends a produce request with a 10MB message compressed to 100KB with gzip. The decompression of the message on the broker could take 10-15 seconds, during which time a request handler thread is completely blocked. In this case, neither the byte-in quota nor the request rate quota may be effective in protecting the broker. Consider another case. A consumer group starts with 10 instances and later on switches to 20 instances. The request rate will likely double, but the actual load on the broker may not double since each fetch request only contains half of the partitions. A request rate quota may not be easy to configure in this case.

What we really want is to be able to prevent a client from
> > > > > > > > > > > > > > >> > >> > >> > > > > > using
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > too
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > much
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > of the server
> > side
> > > > > > > > resources.
> > > > > > > > > In
> > > > > > > > > > > > this
> > > > > > > > > > > > > > >> > >> particular
> > > > > > > > > > > > > > >> > >> > >> KIP,
> > > > > > > > > > > > > > >> > >> > >> > > this
> > > > > > > > > > > > > > >> > >> > >> > > > > > > resource
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > is
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > the
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > capacity of
> the
> > > > > request
> > > > > > > > > handler
> > > > > > > > > > > > > > threads. I
> > > > > > > > > > > > > > >> > >> agree
> > > > > > > > > > > > > > >> > >> > >> that
> > > > > > > > > > > > > > >> > >> > >> > it
> > > > > > > > > > > > > > >> > >> > >> > > > may
> > > > > > > > > > > > > > >> > >> > >> > > > > > not
> > > > > > > > > > > > > > >> > >> > >> > > > > > > be
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > intuitive for
> > the
> > > > > users
> > > > > > to
> > > > > > > > > > > determine
> > > > > > > > > > > > > how
> > > > > > > > > > > > > > >> to
> > > > > > > > > > > > > > >> > set
> > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > >> > >> > >> > right
> > > > > > > > > > > > > > >> > >> > >> > > > > > limit.
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > However,
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > this is not
> > > > completely
> > > > > > new
> > > > > > > > and
> > > > > > > > > > has
> > > > > > > > > > > > > been
> > > > > > > > > > > > > > >> done
> > > > > > > > > > > > > > >> > in
> > > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > > >> > >> > >> > > > container
> > > > > > > > > > > > > > >> > >> > >> > > > > > > world
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > already. For
> > > > example,
> > > > > > > Linux
> > > > > > > > > > > cgroup (
> > > > > > > > > > > > > > >> > >> > >> > > > > https://access.redhat.com/
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > > > > documentation/en-US/Red_Hat_En
> > > > > > > > > > > > > > >> > >> > >> terprise_Linux/6/html/
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > > > > Resource_Management_Guide/sec-
> > > > > > > > > > > > > cpu.html)
> > > > > > > > > > > > > > >> has
> > > > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > > > >> > >> > >> > concept
> > > > > > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > >
> > cpu.cfs_quota_us,
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > which
> specifies
> > > the
> > > > > > total
> > > > > > > > > amount
> > > > > > > > > > > of
> > > > > > > > > > > > > time
> > > > > > > > > > > > > > >> in
> > > > > > > > > > > > > > >> > >> > >> > microseconds
> > > > > > > > > > > > > > >> > >> > >> > > > for
> > > > > > > > > > > > > > >> > >> > >> > > > > > > which
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > all
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > tasks in a
> > cgroup
> > > > can
> > > > > > run
> > > > > > > > > > during a
> > > > > > > > > > > > one
> > > > > > > > > > > > > > >> second
> > > > > > > > > > > > > > >> > >> > >> period.
> > > > > > > > > > > > > > >> > >> > >> > We
> > > > > > > > > > > > > > >> > >> > >> > > > can
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > potentially
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > model the
> > request
> > > > > > handler
> > > > > > > > > > threads
> > > > > > > > > > > > in a
> > > > > > > > > > > > > > >> > similar
> > > > > > > > > > > > > > >> > >> > way.
> > > > > > > > > > > > > > >> > >> > >> For
> > > > > > > > > > > > > > >> > >> > >> > > > > > example,
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > each
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > request
> handler
> > > > thread
> > > > > > can
> > > > > > > > be
> > > > > > > > > 1
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > >> > handler
> > > > > > > > > > > > > > >> > >> > unit
> > > > > > > > > > > > > > >> > >> > >> > and
> > > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > > >> > >> > >> > > > > > > admin
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > can
> > > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > configure a
> > limit
> > > on
> > > > > how
> > > > > > > > many
> > > > > > > > > > > units
> > > > > > > > > > > > > (say
> > > > > > > > > > > > > > >> > 0.01)
> > > > > > > > > > > > > > >> > >> a
> > > > > > > > > > > > > > >> > >> > >> client
> > > > > > > > > > > > > > >> > >> > >> > > can
> > > > > > > > > > > > > > >> > >> > >> > > > > > have.
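
To make the cgroup analogy concrete, here is a minimal sketch of what such
per-client "handler unit" accounting could look like. The class and method
names are illustrative only, not actual Kafka code:

    import java.util.concurrent.TimeUnit;

    // Sketch: each request handler thread contributes one "unit" of
    // capacity per quota window, cgroup-style, and a client is allotted
    // a (possibly fractional) number of units, e.g. 0.01.
    public class HandlerUnitQuota {
        private final double quotaUnits;  // units allotted to this client
        private final long windowNanos;   // one quota window, e.g. 1 second
        private long usedNanos;           // handler time used in this window

        public HandlerUnitQuota(double quotaUnits, long windowSeconds) {
            this.quotaUnits = quotaUnits;
            this.windowNanos = TimeUnit.SECONDS.toNanos(windowSeconds);
        }

        // Record the handler-thread time consumed by a completed request.
        public synchronized void record(long handlerTimeNanos) {
            usedNanos += handlerTimeNanos;
        }

        // One thread supplies windowNanos of time per window, so the
        // client's allowance is quotaUnits * windowNanos.
        public synchronized boolean violated() {
            return usedNanos > quotaUnits * windowNanos;
        }

        // Reset the accounting at the start of each new quota window.
        public synchronized void startNewWindow() {
            usedNanos = 0;
        }
    }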

Regarding not throttling the internal broker to broker requests: we could
do that. Alternatively, we could just let the admin configure a high limit
for the kafka user (it may not be possible to do that easily based on
clientId though).

Ideally we want to be able to protect the utilization of the network
thread pool too. The difficulty is mostly what Rajini said: (1) The
mechanism for throttling the requests is through Purgatory and we will
have to think through how to integrate that into the network layer. (2) In
the network layer, currently we know the user, but not the clientId of the
request. So, it's a bit tricky to throttle based on clientId there. Plus,
the byteOut quota can already protect the network thread utilization for
fetch requests. So, if we can't figure out this part right now, just
focusing on the request handling threads for this KIP is still a useful
feature.

Thanks,

Jun


On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Thank you all for the feedback.

Jay: I have removed exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc; these are
throttled only if authorization fails (so they can't be used for DoS
attacks in a secure cluster, but inter-broker requests can complete
without delays).

I will wait another day to see if there is any objection to quotas based
on request processing time (as opposed to request rate) and if there are
no objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request
handler threads (that made calculation easy). I think the suggestion is to
include the time spent in the network threads as well since that may be
significant. As Jay pointed out, it is more complicated to calculate the
total available CPU time and convert to a ratio when there are *m* I/O
threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give
us what we want, but it can be very expensive on some platforms. As Becket
and Guozhang have pointed out, we do have several time measurements
already for generating metrics that we could use, though we might want to
switch to nanoTime() instead of currentTimeMillis() since some of the
values for small requests may be < 1ms. But rather than add up the time
spent in I/O thread and network thread, wouldn't it be better to convert
the time spent on each thread into a separate ratio? UserA has a request
quota of 5%. Can we take that to mean that UserA can use 5% of the time on
network threads and 5% of the time on I/O threads? If either is exceeded,
the response is throttled - it would mean maintaining two sets of metrics
for the two durations, but would result in more meaningful ratios. We
could define two quota limits (UserA has 5% of request threads and 10% of
network threads), but that seems unnecessary and harder to explain to
users.
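
A minimal sketch of that two-ratio check, assuming the two durations are
tracked separately (the helper below is illustrative, not actual Kafka
code):

    // Sketch: apply a single percentage quota (e.g. 0.05 for 5%) as two
    // independent ratios, one per thread pool; throttle if either is
    // exceeded.
    static boolean shouldThrottle(long ioTimeNanos, long networkTimeNanos,
                                  long windowNanos, double quotaFraction) {
        double ioRatio = (double) ioTimeNanos / windowNanos;
        double networkRatio = (double) networkTimeNanos / windowNanos;
        return ioRatio > quotaFraction || networkRatio > quotaFraction;
    }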

Back to why and how quotas are applied to network thread utilization:

a) In the case of fetch, the time spent in the network thread may be
significant and I can see the need to include this. Are there other
requests where the network thread utilization is significant? In the case
of fetch, request handler thread utilization would throttle clients with
high request rate, low data volume and fetch byte rate quota will throttle
clients with high data volume. Network thread utilization is perhaps
proportional to the data volume. I am wondering if we even need to
throttle based on network thread utilization or whether the data volume
quota covers this case.

b) At the moment, we record and check for quota violation at the same
time. If a quota is violated, the response is delayed. Using Jay's example
of disk reads for fetches happening in the network thread, we can't record
and delay a response after the disk reads. We could record the time spent
on the network thread when the response is complete and introduce a delay
for handling a subsequent request (separating out recording and quota
violation handling in the case of network thread overload). Does that make
sense?
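
Roughly, that separation could look like the sketch below (the names are
made up for illustration; this is not how the broker is implemented
today):

    import java.util.HashMap;
    import java.util.Map;

    // Sketch: once a response has been sent it can no longer be delayed,
    // so any overage is recorded and charged as a delay against the
    // client's next request instead.
    public class DeferredThrottle {
        private final Map<String, Long> pendingDelayMs = new HashMap<>();

        // Called when the response completes on the network thread.
        public synchronized void onResponseComplete(String clientId,
                                                    long overageMs) {
            if (overageMs > 0) {
                pendingDelayMs.merge(clientId, overageMs, Long::sum);
            }
        }

        // Called before handling the client's next request; returns the
        // delay to apply and clears the pending penalty.
        public synchronized long delayForNextRequest(String clientId) {
            Long delay = pendingDelayMs.remove(clientId);
            return delay == null ? 0L : delay;
        }
    }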

Regards,

Rajini


On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:

Hey Jay,

Yeah, I agree that enforcing the CPU time is a little tricky. I am
thinking that maybe we can use the existing request statistics. They are
already very detailed so we can probably see the approximate CPU time from
it, e.g. something like (total_time - request/response_queue_time -
remote_time).
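
In code form, that approximation would be something like the following
(parameter names mirror the existing request-time metrics; the helper
itself is a sketch, not broker code):

    // Sketch: approximate on-CPU time from the per-request time
    // breakdown. Time not spent waiting in queues or on remote work
    // (e.g. purgatory) is roughly the time threads actively spent on
    // the request.
    static long approxCpuTimeMs(long totalTimeMs, long requestQueueTimeMs,
                                long responseQueueTimeMs, long remoteTimeMs) {
        return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs
                - remoteTimeMs;
    }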

I agree with Guozhang that when a user is throttled it is likely that we
need to see if anything has gone wrong first, and if the users are well
behaved and just need more resources, we will have to bump up the quota
for them. It is true that pre-allocating CPU time quota precisely for the
users is difficult. So in practice it would probably be more like first
setting a relatively high protective CPU time quota for everyone and
increasing that for some individual clients on demand.

Thanks,

Jiangjie (Becket) Qin


On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

This is a great proposal, glad to see it happening.

I am inclined to the CPU throttling, or more specifically processing time
ratio, instead of the request rate throttling as well. Becket has summed
up my rationales very well above, and one thing to add here is that the
former has good support for both "protecting against rogue clients" as
well as "utilizing a cluster for multi-tenancy usage": when thinking about
how to explain this to the end users, I find it actually more natural than
the request rate since, as mentioned above, different requests will have
quite different "cost", and Kafka today already has various request types
(produce, fetch, admin, metadata, etc); because of that, request rate
throttling may not be as effective unless it is set very conservatively.

Regarding user reactions when they are throttled, I think it may differ
case-by-case, and needs to be discovered / guided by looking at relative
metrics. So in other words users would not expect to get additional
information by simply being told "hey, you are throttled", which is all
that throttling does; they need to take a follow-up step and see "hmm, I'm
throttled probably because of ..", which is by looking at other metric
values: e.g. whether I'm bombarding the brokers with ...

[Message clipped]

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Thanks for the updated KIP. A few more comments.

30. Should we just account for the time in network threads in this KIP too?
The issue with doing this later is that existing quotas may be too small
and everyone will have to adjust them before upgrading, which is
inconvenient. If we just do the delaying in the io threads, there probably
isn't too much additional work to include the network thread time?

31. It would be useful for the new metrics to capture the utilization of
all those requests exempt from request throttling (under something like
"exempt"). It's useful for an admin to know how much time is spent there
too.

32. "The maximum throttle time for any single request will be the quota
window size (one second by default)." We probably should cap the delay at
quota.window.size.seconds * quota.window.num? (See the sketch below.)

33. It's unfortunate that we use . in configs and _ in ZK data structures.
However, for consistency, request.percentage in ZK probably should be
request_percentage?
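
A sketch of the cap suggested in point 32 above (the helper is
illustrative; quota.window.size.seconds and quota.window.num are the
broker's existing sample-window settings, which I believe default to 1
second and 11 samples):

    // Sketch: cap the computed delay at the full span of the quota
    // sample windows, i.e. quota.window.size.seconds * quota.window.num.
    static long capThrottleTimeMs(long computedDelayMs,
                                  int windowSizeSeconds, int numWindows) {
        long maxDelayMs = (long) windowSizeSeconds * 1000L * numWindows;
        return Math.min(computedDelayMs, maxDelayMs); // 11,000 ms with defaults
    }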

Thanks,

Jun

On Thu, Mar 9, 2017 at 7:55 AM, Rajini Sivaram <ra...@gmail.com>
wrote:

> I have updated the KIP to use "request.percentage" quotas where the
> percentage is out of a total of (num.io.threads * 100). I have added the
> other options considered so far under "Rejected Alternatives".
>
> To address Todd's concern about per-thread quotas: Even though the quotas
> are out of (num.io.threads * 100), clients are not locked into threads.
> Utilization is measured as the total across all the I/O threads, and a
> 10% quota can be 1% of 10 threads. Individual quotas can also be greater
> than 100% if required.
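>
> A worked example of that arithmetic (the numbers are illustrative):
>
>     // With num.io.threads = 8, the pool's capacity is 8 * 100 = 800.
>     int numIoThreads = 8;
>     int totalCapacityPct = numIoThreads * 100;        // 800
>     double quotaPct = 80.0;  // this client's request.percentage
>     // 80 means 80% of one thread-equivalent, i.e. 10% of the whole
>     // pool, which the client may spend as, say, 10% on each thread.
>     double shareOfPool = quotaPct / totalCapacityPct; // 0.10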
>
> Please let me know if there are any other concerns or suggestions.
>
> Thank you,
>
> Rajini
>
> On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tp...@gmail.com> wrote:
>
> > Rajini -
> >
> > I understand what you’re saying, but the point I’m making is that I don’t
> > believe we need to take it into account directly. The CPU utilization of
> > the network threads is directly proportional to the number of bytes being
> > sent. The more bytes, the more CPU that is required for SSL (or other
> > tasks). This is opposed to the request handler threads, where there are a
> > number of factors that affect CPU utilization. This means that it’s not
> > necessary to separately quota network thread byte usage and CPU - if we
> > quota byte usage (which we already do), we have fixed the CPU usage at a
> > proportional amount.
> >
> > Jun -
> >
> > Thanks for the clarification there. I was thinking of the utilization
> > percentage as being fixed, not what the percentage reflects. I’m not tied
> > to either way of doing it, provided that we do not lock clients to a
> > single thread. For example, if I specify that a given client can use
> > 10% of a single thread, that should also mean they can use 1% on 10
> > threads.
> >
> > -Todd
> >
> >
> >
> > On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Todd,
> > >
> > > Thanks for the feedback.
> > >
> > > I just want to clarify your second point. If the limit percentage is
> > > per thread and the thread counts are changed, the absolute processing
> > > limit for existing users hasn't changed and there is no need to adjust
> > > them. On the other hand, if the limit percentage is of total thread
> > > pool capacity and the thread counts are changed, the effective
> > > processing limit for a user will change. So, to preserve the current
> > > processing limit, existing user limits have to be adjusted. If there
> > > is a hardware change, the effective processing limit for a user will
> > > change in either approach and the existing limit may need to be
> > > adjusted. However, hardware changes are less common than thread pool
> > > configuration changes.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tp...@gmail.com> wrote:
> > >
> > > > I’ve been following this one on and off, and overall it sounds good
> to
> > > me.
> > > >
> > > > - The SSL question is a good one. However, that type of overhead
> should
> > > be
> > > > proportional to the bytes rate, so I think that a bytes rate quota
> > would
> > > > still be a suitable way to address it.
> > > >
> > > > - I think it’s better to make the quota percentage of total thread
> pool
> > > > capacity, and not percentage of an individual thread. That way you
> > don’t
> > > > have to adjust it when you adjust thread counts (tuning, hardware
> > > changes,
> > > > etc.)
> > > >
> > > >
> > > > -Todd
> > > >
> > > >
> > > >
> > > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <be...@gmail.com>
> > wrote:
> > > >
> > > > > I see. Good point about SSL.
> > > > >
> > > > > I just asked Todd to take a look.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Jiangjie,
> > > > > >
> > > > > > Yes, I agree that byte rate already protects the network threads
> > > > > > indirectly. I am not sure if byte rate fully captures the CPU
> > > overhead
> > > > in
> > > > > > network due to SSL. So, at the high level, we can use request
> time
> > > > limit
> > > > > to
> > > > > > protect CPU and use byte rate to protect storage and network.
> > > > > >
> > > > > > Also, do you think you can get Todd to comment on this KIP?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <
> becket.qin@gmail.com>
> > > > > wrote:
> > > > > >
> > > > > > > Hi Rajini/Jun,
> > > > > > >
> > > > > > > The percentage-based reasoning sounds good.
> > > > > > > One thing I am wondering is: if we assume the network threads
> > > > > > > are just doing the network IO, can we say the byte rate quota
> > > > > > > is already a sort of network threads quota?
> > > > > > > If we take network threads into consideration here, would that
> > > > > > > be somewhat overlapping with the byte rate quota?
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <
> > > > > rajinisivaram@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Jun,
> > > > > > > >
> > > > > > > > Thank you for the explanation, I hadn't realized you meant
> > > > percentage
> > > > > > of
> > > > > > > > the total thread pool. If everyone is OK with Jun's
> > suggestion, I
> > > > > will
> > > > > > > > update the KIP.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Rajini
> > > > > > > >
> > > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Rajini,
> > > > > > > > >
> > > > > > > > > Let's take your example. Let's say a user sets the limit
> > > > > > > > > to 50%. I am not sure if it's better to apply the same
> > > > > > > > > percentage separately to the network and io thread pools.
> > > > > > > > > For example, for produce requests, most of the time will
> > > > > > > > > be spent in the io threads whereas for fetch requests,
> > > > > > > > > most of the time will be in the network threads. So, using
> > > > > > > > > the same percentage in both thread pools means one of the
> > > > > > > > > pools' resources will be over-allocated.
> > > > > > > > >
> > > > > > > > > An alternative way is to simply model network and io thread
> > > pool
> > > > > > > > together.
> > > > > > > > > If you get 10 io threads and 5 network threads, you get
> 1500%
> > > > > request
> > > > > > > > > processing power. A 50% limit means a total of 750%
> > processing
> > > > > power.
> > > > > > > We
> > > > > > > > > just add up the time a user request spent in either network
> > or
> > > io
> > > > > > > thread.
> > > > > > > > > If that total exceeds 750% (doesn't matter whether it's
> spent
> > > > more
> > > > > in
> > > > > > > > > network or io thread), the request will be throttled. This
> > > seems
> > > > > more
> > > > > > > > > general and is not sensitive to the current implementation
> > > detail
> > > > > of
> > > > > > > > having
> > > > > > > > > a separate network and io thread pool. In the future, if
> the
> > > > > > threading
> > > > > > > > > model changes, the same concept of quota can still be
> > applied.
> > > > For
> > > > > > now,
> > > > > > > > > since it's a bit tricky to add the delay logic in the
> network
> > > > > thread
> > > > > > > > pool,
> > > > > > > > > we could probably just do the delaying only in the io
> threads
> > > as
> > > > > you
> > > > > > > > > suggested earlier.
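> > > > > > > > >
> > > > > > > > > (Making the arithmetic concrete, as a rough sketch only;
> > > > > > > > > the variable names are made up, not the actual broker code:
> > > > > > > > >
> > > > > > > > >     // 10 io threads + 5 network threads => 1500% capacity.
> > > > > > > > >     double capacityPercent = (10 + 5) * 100.0;
> > > > > > > > >     double userLimitPercent = 0.50 * capacityPercent; // 750%
> > > > > > > > >     // Charge the user for time spent in either pool.
> > > > > > > > >     double usedPercent =
> > > > > > > > >         100.0 * (ioTimeMs + networkTimeMs) / windowMs;
> > > > > > > > >     boolean throttled = usedPercent > userLimitPercent;
> > > > > > > > > )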
> > > > > > > > >
> > > > > > > > > There is still the orthogonal question of whether a quota
> of
> > > 50%
> > > > is
> > > > > > out
> > > > > > > > of
> > > > > > > > > 100% or 100% * #total processing threads. My feeling is
> that
> > > the
> > > > > > latter
> > > > > > > > is
> > > > > > > > > slightly better based on my explanation earlier. The way to
> > > > > describe
> > > > > > > this
> > > > > > > > > quota to the users can be "share of elapsed request
> > processing
> > > > time
> > > > > > on
> > > > > > > a
> > > > > > > > > single CPU" (similar to top).
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <
> > > > > > > rajinisivaram@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Jun,
> > > > > > > > > >
> > > > > > > > > > Agree about the two scenarios.
> > > > > > > > > >
> > > > > > > > > > But still not sure about a single quota covering both
> > network
> > > > > > threads
> > > > > > > > and
> > > > > > > > > > I/O threads with per-thread quota. If there are 10 I/O
> > > threads
> > > > > and
> > > > > > 5
> > > > > > > > > > network threads and I want to assign half the quota to
> > userA,
> > > > the
> > > > > > > quota
> > > > > > > > > > would be 750%. I imagine, internally, we would convert
> this
> > > to
> > > > > 500%
> > > > > > > for
> > > > > > > > > I/O
> > > > > > > > > > and 250% for network threads to allocate 50% of each
> pool.
> > > > > > > > > >
> > > > > > > > > > A couple of scenarios:
> > > > > > > > > >
> > > > > > > > > > 1. Admin adds 1 extra network thread. To retain 50%,
> admin
> > > > needs
> > > > > to
> > > > > > > now
> > > > > > > > > > allocate 800% for each user. Or increase the quota for a
> > few
> > > > > users.
> > > > > > > To
> > > > > > > > > me,
> > > > > > > > > > it feels like admin needs to convert 50% to 800% and
> Kafka
> > > > > > internally
> > > > > > > > > needs
> > > > > > > > > > to convert 800% to (500%, 300%). Everyone using just 50%
> > > feels
> > > > a
> > > > > > lot
> > > > > > > > > > simpler.
> > > > > > > > > >
> > > > > > > > > > 2. We decide to add some other thread to this list. Admin
> > > needs
> > > > > to
> > > > > > > know
> > > > > > > > > > exactly how many threads form the maximum quota. And we
> can
> > > be
> > > > > > > changing
> > > > > > > > > > this between broker versions as we add more to the list.
> > > Again
> > > > a
> > > > > > > single
> > > > > > > > > > overall percent would be a lot simpler.
> > > > > > > > > >
> > > > > > > > > > There were others who were unconvinced by a single
> percent
> > > from
> > > > > the
> > > > > > > > > initial
> > > > > > > > > > proposal and were happier with thread units similar to
> CPU
> > > > units,
> > > > > > so
> > > > > > > I
> > > > > > > > am
> > > > > > > > > > ok with going with per-thread quotas (as units or
> percent).
> > > > Just
> > > > > > not
> > > > > > > > sure
> > > > > > > > > > it makes it easier for admin in all cases.
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > >
> > > > > > > > > > Rajini
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <
> jun@confluent.io>
> > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > >
> > > > > > > > > > > Consider modeling as n * 100% units. For 2), the
> > > > > > > > > > > question is what's causing the I/O threads to be
> > > > > > > > > > > saturated. It's unlikely that all users' utilization has
> > > > > > > > > > > increased at the same time. A more likely case is that a
> > > > > > > > > > > few isolated users' utilization has increased. If so,
> > > > > > > > > > > after increasing the number of threads, the admin just
> > > > > > > > > > > needs to adjust the quota for a few isolated users,
> > > > > > > > > > > which is expected and is less work.
> > > > > > > > > > >
> > > > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all users'
> > > > > > > > > > > quotas need to be adjusted, which is unexpected and is
> > > > > > > > > > > more work.
> > > > > > > > > > >
> > > > > > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > > > > > >
> > > > > > > > > > > As for future extension to cover network thread
> > > utilization,
> > > > I
> > > > > > was
> > > > > > > > > > thinking
> > > > > > > > > > > that one way is to simply model the capacity as (n +
> m) *
> > > > 100%
> > > > > > > unit,
> > > > > > > > > > where
> > > > > > > > > > > n and m are the number of network and i/o threads,
> > > > > respectively.
> > > > > > > > Then,
> > > > > > > > > > for
> > > > > > > > > > > each user, we can just add up the utilization in the
> > > network
> > > > > and
> > > > > > > the
> > > > > > > > > i/o
> > > > > > > > > > > thread. If we do this, we don't need a new type of
> quota.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <
> > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Jun,
> > > > > > > > > > > >
> > > > > > > > > > > > If we use request.percentage as the percentage used
> in
> > a
> > > > > single
> > > > > > > I/O
> > > > > > > > > > > thread,
> > > > > > > > > > > > the total percentage being allocated will be
> > > > num.io.threads *
> > > > > > 100
> > > > > > > > for
> > > > > > > > > > I/O
> > > > > > > > > > > > threads and num.network.threads * 100 for network
> > > threads.
> > > > A
> > > > > > > single
> > > > > > > > > > quota
> > > > > > > > > > > > covering the two as a percentage wouldn't quite work
> if
> > > you
> > > > > > want
> > > > > > > to
> > > > > > > > > > > > allocate the same proportion in both cases. If we
> want
> > to
> > > > > treat
> > > > > > > > > threads
> > > > > > > > > > > as
> > > > > > > > > > > > separate units, won't we need two quota
> configurations
> > > > > > regardless
> > > > > > > > of
> > > > > > > > > > > > whether we use units or percentage? Perhaps I
> > > misunderstood
> > > > > > your
> > > > > > > > > > > > suggestion.
> > > > > > > > > > > >
> > > > > > > > > > > > I think there are two cases:
> > > > > > > > > > > >
> > > > > > > > > > > >    1. The use case that you mentioned where an admin
> is
> > > > > adding
> > > > > > > more
> > > > > > > > > > users
> > > > > > > > > > > >    and decides to add more I/O threads and expects to
> > > find
> > > > > free
> > > > > > > > quota
> > > > > > > > > > to
> > > > > > > > > > > >    allocate for new users.
> > > > > > > > > > > >    2. Admin adds more I/O threads because the I/O
> > threads
> > > > are
> > > > > > > > > saturated
> > > > > > > > > > > and
> > > > > > > > > > > >    there are cores available to allocate, even though
> > the
> > > > > > number
> > > > > > > or
> > > > > > > > > > > >    users/clients hasn't changed.
> > > > > > > > > > > >
> > > > > > > > > > > > If we treated I/O threads as a single unit of 100%,
> > > > > > > > > > > > all user quotas need to be reallocated for 1). If we
> > > > > > > > > > > > allocated I/O threads as n units with n*100%, all user
> > > > > > > > > > > > quotas need to be reallocated for 2), otherwise some of
> > > > > > > > > > > > the new threads may just not be used. Either way it
> > > > > > > > > > > > should be easy to write a script to decrease/increase
> > > > > > > > > > > > quotas by a multiple for all users.
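> > > > > > > > > > > >
> > > > > > > > > > > > (For instance, such a rescaling is just a multiple; a
> > > > > > > > > > > > rough sketch with made-up names:
> > > > > > > > > > > >
> > > > > > > > > > > >     // Scale a user's quota when the thread count
> > > > > > > > > > > >     // changes from oldThreads to newThreads.
> > > > > > > > > > > >     double newQuota =
> > > > > > > > > > > >         oldQuota * newThreads / (double) oldThreads;
> > > > > > > > > > > > )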
> > > > > > > > > > > >
> > > > > > > > > > > > So it really boils down to which quota unit is most
> > > > intuitive
> > > > > > in
> > > > > > > > > terms
> > > > > > > > > > of
> > > > > > > > > > > > configuration. And from the discussion so far, it
> feels
> > > > like
> > > > > > > > opinion
> > > > > > > > > is
> > > > > > > > > > > > divided on whether quotas should be carved out of an
> > > > absolute
> > > > > > > 100%
> > > > > > > > > (or
> > > > > > > > > > 1
> > > > > > > > > > > > unit) or be relative to the number of threads (n*100%
> > or
> > > n
> > > > > > > units).
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <
> > > jun@confluent.io>
> > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Another way to express an absolute limit is to use
> > > > > > > > > > request.percentage,
> > > > > > > > > > > > but
> > > > > > > > > > > > > treat it as the percentage used in a single request
> > > > > handling
> > > > > > > > > thread.
> > > > > > > > > > > For
> > > > > > > > > > > > > now, the request handling threads can be just the
> io
> > > > > threads.
> > > > > > > In
> > > > > > > > > the
> > > > > > > > > > > > > future, they can cover the network threads as well.
> > > This
> > > > is
> > > > > > > > similar
> > > > > > > > > > to
> > > > > > > > > > > > how
> > > > > > > > > > > > > top reports CPU usage and may be a bit easier for
> > > people
> > > > to
> > > > > > > > > > understand.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <
> > > > > jun@confluent.io>
> > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi, Jay,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. Regarding request.unit vs request.percentage.
> I
> > > > > started
> > > > > > > with
> > > > > > > > > > > > > > request.percentage too. The reasoning for
> > > request.unit
> > > > is
> > > > > > the
> > > > > > > > > > > > following.
> > > > > > > > > > > > > > Suppose that the capacity has been reached on a
> > > broker
> > > > > and
> > > > > > > the
> > > > > > > > > > admin
> > > > > > > > > > > > > needs
> > > > > > > > > > > > > > to add a new user. A simple way to increase the
> > > > capacity
> > > > > is
> > > > > > > to
> > > > > > > > > > > increase
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > number of io threads, assuming there are still
> > enough
> > > > > > cores.
> > > > > > > If
> > > > > > > > > the
> > > > > > > > > > > > limit
> > > > > > > > > > > > > > is based on percentage, the additional capacity
> > > > > > automatically
> > > > > > > > > gets
> > > > > > > > > > > > > > distributed to existing users and we haven't
> really
> > > > > carved
> > > > > > > out
> > > > > > > > > any
> > > > > > > > > > > > > > additional resource for the new user. Now, is it
> > easy
> > > > > for a
> > > > > > > > user
> > > > > > > > > to
> > > > > > > > > > > > > reason
> > > > > > > > > > > > > > about 0.1 unit vs 10%. My feeling is that both
> are
> > > hard
> > > > > and
> > > > > > > > have
> > > > > > > > > to
> > > > > > > > > > > be
> > > > > > > > > > > > > > configured empirically. Not sure if percentage is
> > > > > obviously
> > > > > > > > > easier
> > > > > > > > > > to
> > > > > > > > > > > > > > reason about.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jun
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <
> > > > > > jay@confluent.io
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >> A couple of quick points:
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> 1. Even though the implementation of this quota
> is
> > > > only
> > > > > > > using
> > > > > > > > io
> > > > > > > > > > > > thread
> > > > > > > > > > > > > >> time, i think we should call it something like
> > > > > > > "request-time".
> > > > > > > > > > This
> > > > > > > > > > > > will
> > > > > > > > > > > > > >> give us flexibility to improve the
> implementation
> > to
> > > > > cover
> > > > > > > > > network
> > > > > > > > > > > > > threads
> > > > > > > > > > > > > >> in the future and will avoid exposing internal
> > > details
> > > > > > like
> > > > > > > > our
> > > > > > > > > > > thread
> > > > > > > > > > > > > >> pools on the server.
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> 2. Jun/Roger, I get what you are trying to fix but
> > > > > > > > > > > > > >> the idea of thread/units is super unintuitive as a
> > > > > > > > > > > > > >> user-facing knob. I had to read the KIP like eight
> > > > > > > > > > > > > >> times to understand this. I'm not sure about your
> > > > > > > > > > > > > >> point that increasing the number of threads is a
> > > > > > > > > > > > > >> problem with a percentage-based value; it really
> > > > > > > > > > > > > >> depends on whether the user thinks about the
> > > > > > > > > > > > > >> "percentage of request processing time" or "thread
> > > > > > > > > > > > > >> units". If they think "I have allocated 10% of my
> > > > > > > > > > > > > >> request processing time to user x" then it is a bug
> > > > > > > > > > > > > >> that increasing the thread count decreases that
> > > > > > > > > > > > > >> percent as it does in the current proposal. As a
> > > > > > > > > > > > > >> practical matter I think the only way to actually
> > > > > > > > > > > > > >> reason about this is as a percent---I just don't
> > > > > > > > > > > > > >> believe people are going to think, "ah, 4.3 thread
> > > > > > > > > > > > > >> units, that is the right amount!". Instead I think
> > > > > > > > > > > > > >> they have to understand this thread unit concept,
> > > > > > > > > > > > > >> figure out what they have set in number of threads,
> > > > > > > > > > > > > >> compute a percent and then come up with the number
> > > > > > > > > > > > > >> of thread units, and these will all be wrong if
> > > > > > > > > > > > > >> that thread count changes. I also think this ties
> > > > > > > > > > > > > >> us to throttling the I/O thread pool, which may not
> > > > > > > > > > > > > >> be where we want to end up.
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> 3. For what it's worth I do think having a
> single
> > > > > > > throttle_ms
> > > > > > > > > > field
> > > > > > > > > > > in
> > > > > > > > > > > > > all
> > > > > > > > > > > > > >> the responses that combines all throttling from
> > all
> > > > > quotas
> > > > > > > is
> > > > > > > > > > > probably
> > > > > > > > > > > > > the
> > > > > > > > > > > > > >> simplest. There could be a use case for having
> > > > separate
> > > > > > > fields
> > > > > > > > > for
> > > > > > > > > > > > each,
> > > > > > > > > > > > > >> but I think that is actually harder to
> use/monitor
> > > in
> > > > > the
> > > > > > > > common
> > > > > > > > > > > case
> > > > > > > > > > > > so
> > > > > > > > > > > > > >> unless someone has a use case I think just one
> > > should
> > > > be
> > > > > > > fine.
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> -Jay
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram
> <
> > > > > > > > > > > > > rajinisivaram@gmail.com>
> > > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > > >>
> > > > > > > > > > > > > >> > I have updated the KIP based on the
> discussions
> > so
> > > > > far.
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > Regards,
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > Rajini
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini
> > Sivaram <
> > > > > > > > > > > > > >> rajinisivaram@gmail.com>
> > > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > > Thank you all for the feedback.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Ismael #1. It makes sense not to throttle
> > > > > inter-broker
> > > > > > > > > > requests
> > > > > > > > > > > > like
> > > > > > > > > > > > > >> > > LeaderAndIsr etc. The simplest way to ensure
> > > that
> > > > > > > clients
> > > > > > > > > > cannot
> > > > > > > > > > > > use
> > > > > > > > > > > > > >> > these
> > > > > > > > > > > > > >> > > requests to bypass quotas for DoS attacks is
> > to
> > > > > ensure
> > > > > > > > that
> > > > > > > > > > ACLs
> > > > > > > > > > > > > >> prevent
> > > > > > > > > > > > > >> > > clients from using these requests and
> > > unauthorized
> > > > > > > > requests
> > > > > > > > > > are
> > > > > > > > > > > > > >> included
> > > > > > > > > > > > > >> > > towards quotas.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Ismael #2, Jay #1 : I was thinking that
> these
> > > > quotas
> > > > > > can
> > > > > > > > > > return
> > > > > > > > > > > a
> > > > > > > > > > > > > >> > separate
> > > > > > > > > > > > > >> > > throttle time, and all utilization based
> > quotas
> > > > > could
> > > > > > > use
> > > > > > > > > the
> > > > > > > > > > > same
> > > > > > > > > > > > > >> field
> > > > > > > > > > > > > >> > > (we won't add another one for network thread
> > > > > > utilization
> > > > > > > > for
> > > > > > > > > > > > > >> instance).
> > > > > > > > > > > > > >> > But
> > > > > > > > > > > > > >> > > perhaps it makes sense to keep byte rate
> > quotas
> > > > > > separate
> > > > > > > > in
> > > > > > > > > > > > > >> produce/fetch
> > > > > > > > > > > > > >> > > responses to provide separate metrics? Agree
> > > with
> > > > > > Ismael
> > > > > > > > > that
> > > > > > > > > > > the
> > > > > > > > > > > > > >> name of
> > > > > > > > > > > > > >> > > the existing field should be changed if we
> > have
> > > > two.
> > > > > > > Happy
> > > > > > > > > to
> > > > > > > > > > > > switch
> > > > > > > > > > > > > >> to a
> > > > > > > > > > > > > >> > > single combined throttle time if that is
> > > > sufficient.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will use
> > dot
> > > > > > > separated
> > > > > > > > > > name
> > > > > > > > > > > > for
> > > > > > > > > > > > > >> new
> > > > > > > > > > > > > >> > > property. Replication quotas use dot
> > separated,
> > > so
> > > > > it
> > > > > > > will
> > > > > > > > > be
> > > > > > > > > > > > > >> consistent
> > > > > > > > > > > > > >> > > with all properties except byte rate quotas.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Radai: #1 Request processing time rather than
> > > > > > > > > > > > > >> > > request rate was chosen because the time per
> > > > > > > > > > > > > >> > > request can vary significantly between requests,
> > > > > > > > > > > > > >> > > as mentioned in the discussion and KIP.
> > > > > > > > > > > > > >> > > #2 Two separate quotas for
> heartbeats/regular
> > > > > requests
> > > > > > > > feel
> > > > > > > > > > like
> > > > > > > > > > > > > more
> > > > > > > > > > > > > >> > > configuration and more metrics. Since most
> > users
> > > > > would
> > > > > > > set
> > > > > > > > > > > quotas
> > > > > > > > > > > > > >> higher
> > > > > > > > > > > > > >> > > than the expected usage and quotas are more
> > of a
> > > > > > safety
> > > > > > > > > net, a
> > > > > > > > > > > > > single
> > > > > > > > > > > > > >> > quota
> > > > > > > > > > > > > >> > > should work in most cases.
> > > > > > > > > > > > > >> > >  #3 The number of requests in purgatory is
> > > limited
> > > > > by
> > > > > > > the
> > > > > > > > > > number
> > > > > > > > > > > > of
> > > > > > > > > > > > > >> > active
> > > > > > > > > > > > > >> > > connections since only one request per
> > > connection
> > > > > will
> > > > > > > be
> > > > > > > > > > > > throttled
> > > > > > > > > > > > > >> at a
> > > > > > > > > > > > > >> > > time.
> > > > > > > > > > > > > >> > > #4 As with byte rate quotas, to use the full
> > > > > allocated
> > > > > > > > > quotas,
> > > > > > > > > > > > > >> > > clients/users would need to use partitions
> > that
> > > > are
> > > > > > > > > > distributed
> > > > > > > > > > > > > across
> > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > >> > > cluster. The alternative of using
> cluster-wide
> > > > > quotas
> > > > > > > > > instead
> > > > > > > > > > of
> > > > > > > > > > > > > >> > per-broker
> > > > > > > > > > > > > >> > > quotas would be far too complex to
> implement.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Dong : We currently have two
> > ClientQuotaManagers
> > > > for
> > > > > > > quota
> > > > > > > > > > types
> > > > > > > > > > > > > Fetch
> > > > > > > > > > > > > >> > and
> > > > > > > > > > > > > >> > > Produce. A new one will be added for
> IOThread,
> > > > which
> > > > > > > > manages
> > > > > > > > > > > > quotas
> > > > > > > > > > > > > >> for
> > > > > > > > > > > > > >> > I/O
> > > > > > > > > > > > > >> > > thread utilization. This will not update the
> > > Fetch
> > > > > or
> > > > > > > > > Produce
> > > > > > > > > > > > > >> queue-size,
> > > > > > > > > > > > > >> > > but will have a separate metric for the
> > > > > queue-size.  I
> > > > > > > > > wasn't
> > > > > > > > > > > > > >> planning to
> > > > > > > > > > > > > >> > > add any additional metrics apart from the
> > > > equivalent
> > > > > > > ones
> > > > > > > > > for
> > > > > > > > > > > > > existing
> > > > > > > > > > > > > >> > > quotas as part of this KIP. Ratio of
> byte-rate
> > > to
> > > > > I/O
> > > > > > > > thread
> > > > > > > > > > > > > >> utilization
> > > > > > > > > > > > > >> > > could be slightly misleading since it
> depends
> > on
> > > > the
> > > > > > > > > sequence
> > > > > > > > > > of
> > > > > > > > > > > > > >> > requests.
> > > > > > > > > > > > > >> > > But we can look into more metrics after the
> > KIP
> > > is
> > > > > > > > > implemented
> > > > > > > > > > > if
> > > > > > > > > > > > > >> > required.
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > I think we need to limit the maximum delay
> > since
> > > > all
> > > > > > > > > requests
> > > > > > > > > > > are
> > > > > > > > > > > > > >> > > throttled. If a client has a quota of 0.001
> > > units
> > > > > and
> > > > > > a
> > > > > > > > > single
> > > > > > > > > > > > > request
> > > > > > > > > > > > > >> > used
> > > > > > > > > > > > > >> > > 50ms, we don't want to delay all requests
> from
> > > the
> > > > > > > client
> > > > > > > > by
> > > > > > > > > > 50
> > > > > > > > > > > > > >> seconds,
> > > > > > > > > > > > > >> > > throwing the client out of all its consumer
> > > > groups.
> > > > > > The
> > > > > > > > > issue
> > > > > > > > > > is
> > > > > > > > > > > > > only
> > > > > > > > > > > > > >> if
> > > > > > > > > > > > > >> > a
> > > > > > > > > > > > > >> > > user is allocated a quota that is
> insufficient
> > > to
> > > > > > > process
> > > > > > > > > one
> > > > > > > > > > > > large
> > > > > > > > > > > > > >> > > request. The expectation is that the units
> > > > allocated
> > > > > > per
> > > > > > > > > user
> > > > > > > > > > > will
> > > > > > > > > > > > > be
> > > > > > > > > > > > > >> > much
> > > > > > > > > > > > > >> > > higher than the time taken to process one
> > > request
> > > > > and
> > > > > > > the
> > > > > > > > > > limit
> > > > > > > > > > > > > should
> > > > > > > > > > > > > >> > > seldom be applied. Agree this needs proper
> > > > > > > documentation.
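> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > (Making that arithmetic explicit, as a rough
> > > > > > > > > > > > > >> > > sketch only: the delay is roughly usage / quota,
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > >     double delayMs = 50.0 / 0.001; // 50,000 ms
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > i.e. 50 seconds, hence the need to cap the
> > > > > > > > > > > > > >> > > maximum delay.)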
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Regards,
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > Rajini
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > > > > > > > > > radai.rosenblatt@gmail.com>
> > > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > > >> > >
> > > > > > > > > > > > > >> > >> @jun: i wasnt concerned about tying up a
> > > request
> > > > > > > > processing
> > > > > > > > > > > > thread,
> > > > > > > > > > > > > >> but
> > > > > > > > > > > > > >> > >> IIUC the code does still read the entire
> > > request
> > > > > out,
> > > > > > > > which
> > > > > > > > > > > might
> > > > > > > > > > > > > >> add-up
> > > > > > > > > > > > > >> > >> to
> > > > > > > > > > > > > >> > >> a non-negligible amount of memory.
> > > > > > > > > > > > > >> > >>
> > > > > > > > > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin
> <
> > > > > > > > > > > lindong28@gmail.com>
> > > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > > >> > >>
> > > > > > > > > > > > > >> > >> > Hey Rajini,
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > The current KIP says that the maximum
> delay
> > > > will
> > > > > be
> > > > > > > > > reduced
> > > > > > > > > > > to
> > > > > > > > > > > > > >> window
> > > > > > > > > > > > > >> > >> size
> > > > > > > > > > > > > >> > >> > if it is larger than the window size. I
> > have
> > > a
> > > > > > > concern
> > > > > > > > > with
> > > > > > > > > > > > this:
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > 1) This essentially means that the user
> is
> > > > > allowed
> > > > > > to
> > > > > > > > > > exceed
> > > > > > > > > > > > > their
> > > > > > > > > > > > > >> > quota
> > > > > > > > > > > > > >> > >> > over a long period of time. Can you
> provide
> > > an
> > > > > > upper
> > > > > > > > > bound
> > > > > > > > > > on
> > > > > > > > > > > > > this
> > > > > > > > > > > > > >> > >> > deviation?
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > 2) What is the motivation for cap the
> > maximum
> > > > > delay
> > > > > > > by
> > > > > > > > > the
> > > > > > > > > > > > window
> > > > > > > > > > > > > >> > size?
> > > > > > > > > > > > > >> > >> I
> > > > > > > > > > > > > >> > >> > am wondering if there is better
> alternative
> > > to
> > > > > > > address
> > > > > > > > > the
> > > > > > > > > > > > > problem.
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > 3) It means that the existing metric-related
> > > > > > > > > > > > > >> > >> > config will have a more direct impact on the
> > > > > > > > > > > > > >> > >> > mechanism of this io-thread-unit-based quota.
> > > > > > > > > > > > > >> > >> > This may be an important change depending on
> > > > > > > > > > > > > >> > >> > the answer to 1) above. We probably need to
> > > > > > > > > > > > > >> > >> > document this more explicitly.
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > Dong
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong
> Lin
> > <
> > > > > > > > > > > > lindong28@gmail.com>
> > > > > > > > > > > > > >> > wrote:
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > > Hey Jun,
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > Yeah you are right. I thought it wasn't
> > > > because
> > > > > > at
> > > > > > > > > > LinkedIn
> > > > > > > > > > > > it
> > > > > > > > > > > > > >> will
> > > > > > > > > > > > > >> > be
> > > > > > > > > > > > > >> > >> > too
> > > > > > > > > > > > > >> > >> > > much pressure on inGraph to expose
> those
> > > > > > > per-clientId
> > > > > > > > > > > metrics
> > > > > > > > > > > > > so
> > > > > > > > > > > > > >> we
> > > > > > > > > > > > > >> > >> ended
> > > > > > > > > > > > > >> > >> > > up printing them periodically to local
> > log.
> > > > > Never
> > > > > > > > mind
> > > > > > > > > if
> > > > > > > > > > > it
> > > > > > > > > > > > is
> > > > > > > > > > > > > >> not
> > > > > > > > > > > > > >> > a
> > > > > > > > > > > > > >> > >> > > general problem.
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > Hey Rajini,
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > - I agree with Jay that we probably
> don't
> > > > want
> > > > > to
> > > > > > > > add a
> > > > > > > > > > new
> > > > > > > > > > > > > field
> > > > > > > > > > > > > >> > for
> > > > > > > > > > > > > >> > >> > > every quota ProduceResponse or
> > > FetchResponse.
> > > > > Is
> > > > > > > > there
> > > > > > > > > > any
> > > > > > > > > > > > > >> use-case
> > > > > > > > > > > > > >> > >> for
> > > > > > > > > > > > > >> > >> > > having separate throttle-time fields
> for
> > > > > > > > > byte-rate-quota
> > > > > > > > > > > and
> > > > > > > > > > > > > >> > >> > > io-thread-unit-quota? You probably need
> > to
> > > > > > document
> > > > > > > > > this
> > > > > > > > > > as
> > > > > > > > > > > > > >> > interface
> > > > > > > > > > > > > >> > >> > > change if you plan to add new field in
> > any
> > > > > > request.
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > - I don't think IOThread belongs to
> > > > > > > > > > > > > >> > >> > > quotaType. The existing quota types (i.e.
> > > > > > > > > > > > > >> > >> > > Produce/Fetch/LeaderReplication/FollowerReplication)
> > > > > > > > > > > > > >> > >> > > identify the types of requests that are
> > > > > > > > > > > > > >> > >> > > throttled, not the quota mechanism that is
> > > > > > > > > > > > > >> > >> > > applied.
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > - If a request is throttled due to this
> > > > > > > > > > > io-thread-unit-based
> > > > > > > > > > > > > >> quota,
> > > > > > > > > > > > > >> > is
> > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > >> > >> > > existing queue-size metric in
> > > > > ClientQuotaManager
> > > > > > > > > > > incremented?
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > - In the interest of providing guide
> line
> > > for
> > > > > > admin
> > > > > > > > to
> > > > > > > > > > > decide
> > > > > > > > > > > > > >> > >> > > io-thread-unit-based quota and for user
> > to
> > > > > > > understand
> > > > > > > > > its
> > > > > > > > > > > > > impact
> > > > > > > > > > > > > >> on
> > > > > > > > > > > > > >> > >> their
> > > > > > > > > > > > > >> > >> > > traffic, would it be useful to have a
> > > metric
> > > > > that
> > > > > > > > shows
> > > > > > > > > > the
> > > > > > > > > > > > > >> overall
> > > > > > > > > > > > > >> > >> > > byte-rate per io-thread-unit? Can we
> also
> > > > show
> > > > > > > this a
> > > > > > > > > > > > > >> per-clientId
> > > > > > > > > > > > > >> > >> > metric?
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > Thanks,
> > > > > > > > > > > > > >> > >> > > Dong
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun
> Rao
> > <
> > > > > > > > > > jun@confluent.io
> > > > > > > > > > > >
> > > > > > > > > > > > > >> wrote:
> > > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > > >> > >> > >> Hi, Ismael,
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> For #3, typically, an admin won't
> > > configure
> > > > > more
> > > > > > > io
> > > > > > > > > > > threads
> > > > > > > > > > > > > than
> > > > > > > > > > > > > >> > CPU
> > > > > > > > > > > > > >> > >> > >> cores,
> > > > > > > > > > > > > >> > >> > >> but it's possible for an admin to
> start
> > > with
> > > > > > fewer
> > > > > > > > io
> > > > > > > > > > > > threads
> > > > > > > > > > > > > >> than
> > > > > > > > > > > > > >> > >> cores
> > > > > > > > > > > > > >> > >> > >> and grow that later on.
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> Hi, Dong,
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> I think the throttleTime sensor on the
> > > > broker
> > > > > > > tells
> > > > > > > > > the
> > > > > > > > > > > > admin
> > > > > > > > > > > > > >> > >> whether a
> > > > > > > > > > > > > >> > >> > >> user/clentId is throttled or not.
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> Hi, Radi,
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> The reasoning for delaying the
> throttled
> > > > > > requests
> > > > > > > on
> > > > > > > > > the
> > > > > > > > > > > > > broker
> > > > > > > > > > > > > >> > >> instead
> > > > > > > > > > > > > >> > >> > of
> > > > > > > > > > > > > >> > >> > >> returning an error immediately is that
> > the
> > > > > > latter
> > > > > > > > has
> > > > > > > > > no
> > > > > > > > > > > way
> > > > > > > > > > > > > to
> > > > > > > > > > > > > >> > >> prevent
> > > > > > > > > > > > > >> > >> > >> the
> > > > > > > > > > > > > >> > >> > >> client from retrying immediately,
> which
> > > will
> > > > > > make
> > > > > > > > > things
> > > > > > > > > > > > > worse.
> > > > > > > > > > > > > >> The
> > > > > > > > > > > > > >> > >> > >> delaying logic is based off a delay
> > > queue. A
> > > > > > > > separate
> > > > > > > > > > > > > expiration
> > > > > > > > > > > > > >> > >> thread
> > > > > > > > > > > > > >> > >> > >> just waits on the next to be expired
> > > > request.
> > > > > > So,
> > > > > > > it
> > > > > > > > > > > doesn't
> > > > > > > > > > > > > tie
> > > > > > > > > > > > > >> > up a
> > > > > > > > > > > > > >> > >> > >> request handler thread.
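> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> (A minimal sketch of that pattern using
> > > > > > > > > > > > > >> > >> > >> java.util.concurrent's DelayQueue, Delayed
> > > > > > > > > > > > > >> > >> > >> and TimeUnit; the class name is made up and
> > > > > > > > > > > > > >> > >> > >> this is not the broker's actual code:
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >>     class ThrottledRequest implements Delayed {
> > > > > > > > > > > > > >> > >> > >>         final long expireAtMs;
> > > > > > > > > > > > > >> > >> > >>         ThrottledRequest(long delayMs) {
> > > > > > > > > > > > > >> > >> > >>             this.expireAtMs = System.currentTimeMillis() + delayMs;
> > > > > > > > > > > > > >> > >> > >>         }
> > > > > > > > > > > > > >> > >> > >>         public long getDelay(TimeUnit unit) {
> > > > > > > > > > > > > >> > >> > >>             return unit.convert(expireAtMs - System.currentTimeMillis(),
> > > > > > > > > > > > > >> > >> > >>                                 TimeUnit.MILLISECONDS);
> > > > > > > > > > > > > >> > >> > >>         }
> > > > > > > > > > > > > >> > >> > >>         public int compareTo(Delayed other) {
> > > > > > > > > > > > > >> > >> > >>             return Long.compare(getDelay(TimeUnit.MILLISECONDS),
> > > > > > > > > > > > > >> > >> > >>                                 other.getDelay(TimeUnit.MILLISECONDS));
> > > > > > > > > > > > > >> > >> > >>         }
> > > > > > > > > > > > > >> > >> > >>     }
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >>     // A single expiration thread blocks on take() and resumes
> > > > > > > > > > > > > >> > >> > >>     // each request once its delay elapses, so no request
> > > > > > > > > > > > > >> > >> > >>     // handler thread is ever tied up waiting.
> > > > > > > > > > > > > >> > >> > >>     DelayQueue<ThrottledRequest> queue = new DelayQueue<>();
> > > > > > > > > > > > > >> > >> > >> )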
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> Thanks,
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> Jun
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM,
> Ismael
> > > > Juma <
> > > > > > > > > > > > > ismael@juma.me.uk
> > > > > > > > > > > > > >> >
> > > > > > > > > > > > > >> > >> wrote:
> > > > > > > > > > > > > >> > >> > >>
> > > > > > > > > > > > > >> > >> > >> > Hi Jay,
> > > > > > > > > > > > > >> > >> > >> >
> > > > > > > > > > > > > >> > >> > >> > Regarding 1, I definitely like the
> > > > > simplicity
> > > > > > of
> > > > > > > > > > > keeping a
> > > > > > > > > > > > > >> single
> > > > > > > > > > > > > >> > >> > >> throttle
> > > > > > > > > > > > > >> > >> > >> > time field in the response. The
> > downside
> > > > is
> > > > > > that
> > > > > > > > the
> > > > > > > > > > > > client
> > > > > > > > > > > > > >> > metrics
> > > > > > > > > > > > > >> > >> > >> will be
> > > > > > > > > > > > > >> > >> > >> > more coarse grained.
> > > > > > > > > > > > > >> > >> > >> >
> > > > > > > > > > > > > >> > >> > >> > Regarding 3, we have
> > > > > > > `leader.imbalance.per.broker.
> > > > > > > > > > > > > percentage`
> > > > > > > > > > > > > >> > and
> > > > > > > > > > > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > > > > > > > > > > >> > >> > >> >
> > > > > > > > > > > > > >> > >> > >> > Ismael
> > > > > > > > > > > > > >> > >> > >> >
> > > > > > > > > > > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay
> > > > Kreps <
> > > > > > > > > > > > > jay@confluent.io>
> > > > > > > > > > > > > >> > >> wrote:
> > > > > > > > > > > > > >> > >> > >> >
> > > > > > > > > > > > > >> > >> > >> > > A few minor comments:
> > > > > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > > > > >> > >> > >> > >    1. Isn't it the case that the
> > > > > throttling
> > > > > > > time
> > > > > > > > > > > > response
> > > > > > > > > > > > > >> field
> > > > > > > > > > > > > >> > >> > should
> > > > > > > > > > > > > >> > >> > >> > have
> > > > > > > > > > > > > >> > >> > >> > >    the total time your request was
> > > > > throttled
> > > > > > > > > > > > irrespective
> > > > > > > > > > > > > of
> > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > >> > >> > >> quotas
> > > > > > > > > > > > > >> > >> > >> > > that
> > > > > > > > > > > > > >> > >> > >> > >    caused that. Limiting it to
> byte
> > > rate
> > > > > > quota
> > > > > > > > > > doesn't
> > > > > > > > > > > > > make
> > > > > > > > > > > > > >> > >> sense,
> > > > > > > > > > > > > >> > >> > >> but I
> > > > > > > > > > > > > >> > >> > >> > > also
> > > > > > > > > > > > > >> > >> > >> > >    I don't think we want to end up
> > > > adding
> > > > > > new
> > > > > > > > > fields
> > > > > > > > > > > in
> > > > > > > > > > > > > the
> > > > > > > > > > > > > >> > >> response
> > > > > > > > > > > > > >> > >> > >> for
> > > > > > > > > > > > > >> > >> > >> > > every
> > > > > > > > > > > > > >> > >> > >> > >    single thing we quota, right?
> > > > > > > > > > > > > >> > >> > >> > >    2. I don't think we should make
> > > this
> > > > > > quota
> > > > > > > > > > > > specifically
> > > > > > > > > > > > > >> > about
> > > > > > > > > > > > > >> > >> io
> > > > > > > > > > > > > >> > >> > >> > >    threads. Once we introduce
> these
> > > > quotas
> > > > > > > > people
> > > > > > > > > > set
> > > > > > > > > > > > them
> > > > > > > > > > > > > >> and
> > > > > > > > > > > > > >> > >> > expect
> > > > > > > > > > > > > >> > >> > >> > them
> > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > >> > >> > >> > >    be enforced (and if they aren't
> > it
> > > > may
> > > > > > > cause
> > > > > > > > an
> > > > > > > > > > > > > outage).
> > > > > > > > > > > > > >> As
> > > > > > > > > > > > > >> > a
> > > > > > > > > > > > > >> > >> > >> result
> > > > > > > > > > > > > >> > >> > >> > > they
> > > > > > > > > > > > > >> > >> > >> > >    are a bit more sensitive than
> > > normal
> > > > > > > > configs, I
> > > > > > > > > > > > think.
> > > > > > > > > > > > > >> The
> > > > > > > > > > > > > >> > >> > current
> > > > > > > > > > > > > >> > >> > >> > > thread
> > > > > > > > > > > > > >> > >> > >> > >    pools seem like something of an
> > > > > > > > implementation
> > > > > > > > > > > detail
> > > > > > > > > > > > > and
> > > > > > > > > > > > > >> > not
> > > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > > >> > >> > >> > level
> > > > > > > > > > > > > >> > >> > >> > > the
> > > > > > > > > > > > > >> > >> > >> > >    user-facing quotas should be
> > > involved
> > > > > > > with. I
> > > > > > > > > > think
> > > > > > > > > > > > it
> > > > > > > > > > > > > >> might
> > > > > > > > > > > > > >> > >> be
> > > > > > > > > > > > > >> > >> > >> better
> > > > > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > > > > >> > >> > >> > >    make this a general
> request-time
> > > > > throttle
> > > > > > > > with
> > > > > > > > > no
> > > > > > > > > > > > > >> mention in
> > > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > > >> > >> > >> > naming
> > > > > > > > > > > > > >> > >> > >> > >    about I/O threads and simply
> > > > > acknowledge
> > > > > > > the
> > > > > > > > > > > current
> > > > > > > > > > > > > >> > >> limitation
> > > > > > > > > > > > > >> > >> > >> (which
> > > > > > > > > > > > > >> > >> > >> > > we
> > > > > > > > > > > > > >> > >> > >> > >    may someday fix) in the docs
> that
> > > > this
> > > > > > > covers
> > > > > > > > > > only
> > > > > > > > > > > > the
> > > > > > > > > > > > > >> time
> > > > > > > > > > > > > >> > >> after
> > > > > > > > > > > > > >> > >> > >> the
> > > > > > > > > > > > > >> > >> > >> > >    thread is read off the network.
> > > > > > > > > > > > > >> > >> > >> > >    3. As such I think the right
> > > > interface
> > > > > to
> > > > > > > the
> > > > > > > > > > user
> > > > > > > > > > > > > would
> > > > > > > > > > > > > >> be
> > > > > > > > > > > > > >> > >> > >> something
> > > > > > > > > > > > > >> > >> > >> > >    like percent_request_time and
> be
> > in
> > > > > > > > {0,...100}
> > > > > > > > > or
> > > > > > > > > > > > > >> > >> > >> request_time_ratio
> > > > > > > > > > > > > >> > >> > >> > > and be
> > > > > > > > > > > > > >> > >> > >> > >    in {0.0,...,1.0} (I think
> "ratio"
> > > is
> > > > > the
> > > > > > > > > > > terminology
> > > > > > > > > > > > we
> > > > > > > > > > > > > >> used
> > > > > > > > > > > > > >> > >> if
> > > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > > >> > >> > >> > > scale
> > > > > > > > > > > > > >> > >> > >> > >    is between 0 and 1 in the other
> > > > > metrics,
> > > > > > > > > right?)
> > > > > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > > > > >> > >> > >> > > -Jay
> > > > > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM,
> > > Rajini
> > > > > > > Sivaram
> > > > > > > > <
> > > > > > > > > > > > > >> > >> > >> rajinisivaram@gmail.com
> > > > > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > > > > >> > >> > >> > > wrote:
> > > > > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > Thank you for the feedback.
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > Guozhang : I have updated the
> > > section
> > > > on
> > > > > > > > > > > co-existence
> > > > > > > > > > > > of
> > > > > > > > > > > > > >> byte
> > > > > > > > > > > > > >> > >> rate
> > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > >> > >> > >> > > > request time quotas.
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > Dong: I hadn't added much detail
> > to
> > > > the
> > > > > > > > metrics
> > > > > > > > > > and
> > > > > > > > > > > > > >> sensors
> > > > > > > > > > > > > >> > >> since
> > > > > > > > > > > > > >> > >> > >> they
> > > > > > > > > > > > > >> > >> > >> > > are
> > > > > > > > > > > > > >> > >> > >> > > > going to be very similar to the
> > > > existing
> > > > > > > > metrics
> > > > > > > > > > and
> > > > > > > > > > > > > >> sensors.
> > > > > > > > > > > > > >> > >> To
> > > > > > > > > > > > > >> > >> > >> avoid
> > > > > > > > > > > > > >> > >> > >> > > > confusion, I have now added more
> > > > detail.
> > > > > > All
> > > > > > > > > > metrics
> > > > > > > > > > > > are
> > > > > > > > > > > > > >> in
> > > > > > > > > > > > > >> > the
> > > > > > > > > > > > > >> > >> > >> group
> > > > > > > > > > > > > >> > >> > >> > > > "quotaType" and all sensors have
> > > names
> > > > > > > > starting
> > > > > > > > > > with
> > > > > > > > > > > > > >> > >> "quotaType"
> > > > > > > > > > > > > >> > >> > >> (where
> > > > > > > > > > > > > >> > >> > >> > > > quotaType is Produce/Fetch/
> > > > > > > LeaderReplication/
> > > > > > > > > > > > > >> > >> > >> > > > FollowerReplication/*IOThread*
> ).
> > > > > > > > > > > > > >> > >> > >> > > > So there will be no reuse of
> > > existing
> > > > > > > > > > > metrics/sensors.
> > > > > > > > > > > > > The
> > > > > > > > > > > > > >> > new
> > > > > > > > > > > > > >> > >> > ones
> > > > > > > > > > > > > >> > >> > >> for
> > > > > > > > > > > > > >> > >> > >> > > > request processing time based
> > > > throttling
> > > > > > > will
> > > > > > > > be
> > > > > > > > > > > > > >> completely
> > > > > > > > > > > > > >> > >> > >> independent
> > > > > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > > > > >> > >> > >> > > > existing metrics/sensors, but
> will
> > > be
> > > > > > > > consistent
> > > > > > > > > > in
> > > > > > > > > > > > > >> format.
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > The existing throttle_time_ms
> > field
> > > in
> > > > > > > > > > produce/fetch
> > > > > > > > > > > > > >> > responses
> > > > > > > > > > > > > >> > >> > will
> > > > > > > > > > > > > >> > >> > >> not
> > > > > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > > > > >> > >> > >> > > > impacted by this KIP. That will
> > > > continue
> > > > > > to
> > > > > > > > > return
> > > > > > > > > > > > > >> byte-rate
> > > > > > > > > > > > > >> > >> based
> > > > > > > > > > > > > >> > >> > >> > > > throttling times. In addition, a
> > new
> > > > > field
> > > > > > > > > > > > > >> > >> > request_throttle_time_ms
> > > > > > > > > > > > > >> > >> > >> > will
> > > > > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > > > > >> > >> > >> > > > added to return request quota
> > based
> > > > > > > throttling
> > > > > > > > > > > times.
> > > > > > > > > > > > > >> These
> > > > > > > > > > > > > >> > >> will
> > > > > > > > > > > > > >> > >> > be
> > > > > > > > > > > > > >> > >> > >> > > exposed
> > > > > > > > > > > > > >> > >> > >> > > > as new metrics on the
> client-side.
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > Since all metrics and sensors
> are
> > > > > > different
> > > > > > > > for
> > > > > > > > > > each
> > > > > > > > > > > > > type
> > > > > > > > > > > > > >> of
> > > > > > > > > > > > > >> > >> > quota,
> > > > > > > > > > > > > >> > >> > >> I
> > > > > > > > > > > > > >> > >> > >> > > > believe there is already
> > sufficient
> > > > > > metrics
> > > > > > > to
> > > > > > > > > > > monitor
> > > > > > > > > > > > > >> > >> throttling
> > > > > > > > > > > > > >> > >> > on
> > > > > > > > > > > > > >> > >> > >> > both
> > > > > > > > > > > > > >> > >> > >> > > > client and broker side for each
> > type
> > > > of
> > > > > > > > > > throttling.
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > Regards,
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > Rajini
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM,
> > > Dong
> > > > > Lin
> > > > > > <
> > > > > > > > > > > > > >> > lindong28@gmail.com
> > > > > > > > > > > > > >> > >> >
> > > > > > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > > > > >> > >> > >> > > > > Hey Rajini,
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > > I think it makes a lot of
> sense
> > to
> > > > use
> > > > > > > > > > > > io_thread_units
> > > > > > > > > > > > > >> as
> > > > > > > > > > > > > >> > >> metric
> > > > > > > > > > > > > >> > >> > >> to
> > > > > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > > > > >> > >> > >> > > > > user's traffic here. LGTM
> > > overall. I
> > > > > > have
> > > > > > > > some
> > > > > > > > > > > > > questions
> > > > > > > > > > > > > >> > >> > regarding
> > > > > > > > > > > > > >> > >> > >> > > > sensors.
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > > - Can you be more specific in
> > the
> > > > KIP
> > > > > > what
> > > > > > > > > > sensors
> > > > > > > > > > > > > will
> > > > > > > > > > > > > >> be
> > > > > > > > > > > > > >> > >> > added?
> > > > > > > > > > > > > >> > >> > >> For
> > > > > > > > > > > > > >> > >> > >> > > > > example, it will be useful to
> > > > specify
> > > > > > the
> > > > > > > > name
> > > > > > > > > > and
> > > > > > > > > > > > > >> > >> attributes of
> > > > > > > > > > > > > >> > >> > >> > these
> > > > > > > > > > > > > >> > >> > >> > > > new
> > > > > > > > > > > > > >> > >> > >> > > > > sensors.
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > > - We currently have
> > throttle-time
> > > > and
> > > > > > > > > queue-size
> > > > > > > > > > > for
> > > > > > > > > > > > > >> > >> byte-rate
> > > > > > > > > > > > > >> > >> > >> based
> > > > > > > > > > > > > >> > >> > >> > > > quota.
> > > > > > > > > > > > > >> > >> > >> > > > > Are you going to have separate
> > > > > > > throttle-time
> > > > > > > > > and
> > > > > > > > > > > > > >> queue-size
> > > > > > > > > > > > > >> > >> for
> > > > > > > > > > > > > >> > >> > >> > > requests
> > > > > > > > > > > > > >> > >> > >> > > > > throttled by
> > io_thread_unit-based
> > > > > quota,
> > > > > > > or
> > > > > > > > > will
> > > > > > > > > > > > they
> > > > > > > > > > > > > >> share
> > > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > > >> > >> > >> same
> > > > > > > > > > > > > >> > >> > >> > > > > sensor?
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > > - Does the throttle-time in
> the
> > > > > > > > > ProduceResponse
> > > > > > > > > > > and
> > > > > > > > > > > > > >> > >> > FetchResponse
> > > > > > > > > > > > > >> > >> > >> > > > contains
> > > > > > > > > > > > > >> > >> > >> > > > > time due to
> io_thread_unit-based
> > > > > quota?
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > > - Currently the kafka server doesn't
> > > > > > > > > > > > > >> > >> > >> > > > > provide any log or metrics that tell
> > > > > > > > > > > > > >> > >> > >> > > > > whether any given clientId (or user)
> > > > > > > > > > > > > >> > >> > >> > > > > is throttled. This is not too bad
> > > > > > > > > > > > > >> > >> > >> > > > > because we can still check the
> > > > > > > > > > > > > >> > >> > >> > > > > client-side byte-rate metric to
> > > > > > > > > > > > > >> > >> > >> > > > > validate whether a given client is
> > > > > > > > > > > > > >> > >> > >> > > > > throttled. But with this
> > > > > > > > > > > > > >> > >> > >> > > > > io_thread_unit, there will be no way
> > > > > > > > > > > > > >> > >> > >> > > > > to validate whether a given client is
> > > > > > > > > > > > > >> > >> > >> > > > > slow because it has exceeded its
> > > > > > > > > > > > > >> > >> > >> > > > > io_thread_unit limit. It is necessary
> > > > > > > > > > > > > >> > >> > >> > > > > for users to be able to know this
> > > > > > > > > > > > > >> > >> > >> > > > > information to figure out whether they
> > > > > > > > > > > > > >> > >> > >> > > > > have reached their quota limit. How
> > > > > > > > > > > > > >> > >> > >> > > > > about we add a log4j log on the server
> > > > > > > > > > > > > >> > >> > >> > > > > side to periodically print the
> > > > > > > > > > > > > >> > >> > >> > > > > (client_id, byte-rate-throttle-time,
> > > > > > > > > > > > > >> > >> > >> > > > > io-thread-unit-throttle-time) so that
> > > > > > > > > > > > > >> > >> > >> > > > > the kafka administrator can identify
> > > > > > > > > > > > > >> > >> > >> > > > > those users that have reached their
> > > > > > > > > > > > > >> > >> > >> > > > > limit and act accordingly?
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > > Thanks,
> > > > > > > > > > > > > >> > >> > >> > > > > Dong
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > > > > >> > >> > >> > > > >
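For illustration only, a minimal sketch of the kind of periodic server-side logging Dong suggests above. Everything here (the scheduler, the lookupThrottleTimes() callback, the log format) is an assumption for the sketch, not part of the KIP:

    import java.util.concurrent.{Executors, TimeUnit}
    import org.apache.log4j.Logger

    // Hypothetical sketch: periodically log per-client throttle times so an
    // administrator can see which clients have hit their quotas. The
    // lookupThrottleTimes() callback is assumed to read the relevant sensors
    // and return (clientId, byte-rate throttle ms, io-thread-unit throttle ms).
    object ThrottleLogger {
      private val log = Logger.getLogger("kafka.server.ThrottleLogger")
      private val scheduler = Executors.newSingleThreadScheduledExecutor()

      def start(lookupThrottleTimes: () => Seq[(String, Double, Double)]): Unit = {
        scheduler.scheduleAtFixedRate(new Runnable {
          def run(): Unit = {
            for ((clientId, byteRateMs, ioUnitMs) <- lookupThrottleTimes()) {
              // Only log clients that were actually delayed in this interval
              if (byteRateMs > 0 || ioUnitMs > 0)
                log.info(s"Throttled client=$clientId " +
                  s"byteRateThrottleMs=$byteRateMs ioThreadUnitThrottleMs=$ioUnitMs")
            }
          }
        }, 30, 30, TimeUnit.SECONDS)
      }
    }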
On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:

Stated as "Request processing time throttling will be applied on top if necessary.", I thought that it meant the request processing time throttling is applied first, but continuing to read I found it actually means the produce / fetch byte rate throttling is applied first.

Also the last sentence, "The remaining delay if any is applied to the response.", is a bit confusing to me. Maybe reword it a bit?

Guozhang
On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun
On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of a percentage. The property is called *io_thread_units* to align with the thread count property *num.io.threads*. When we implement network thread utilization quotas, we can add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun, did you mean a different request that needs to be added? The four requests currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is easy to exclude them and only throttle if unauthorized. I wasn't sure if there are other requests used only for inter-broker communication that needed to be excluded.

3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling, if any, plus sends the response. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.

Regards,

Rajini
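For illustration, a rough sketch of the sendResponseMaybeThrottle() idea from point 3 above. This is not the actual patch: the quota-manager API, field names and types below (RequestQuotaManager, dequeueTimeNanos, delayResponse) are assumptions standing in for whatever the real implementation uses.

    // Hypothetical sketch: every call site that previously invoked
    // requestChannel.sendResponse() goes through this helper, so the time
    // already spent handling the request is charged against the quota
    // before the response is sent.
    class ApiHelper(requestChannel: RequestChannel,
                    quotaManager: RequestQuotaManager,
                    time: Time) {

      def sendResponseMaybeThrottle(request: RequestChannel.Request,
                                    response: RequestChannel.Response): Unit = {
        // Handler time used by this request (nanoTime-based, per the thread's
        // discussion of sub-millisecond requests); dequeueTimeNanos is assumed
        val requestTimeNanos = time.nanoseconds - request.dequeueTimeNanos
        // Record usage; returns how long (ms) to delay the response, if at all
        val throttleTimeMs = quotaManager.recordAndMaybeThrottle(
          request.session, request.clientId, requestTimeNanos)
        if (throttleTimeMs > 0)
          quotaManager.delayResponse(response, throttleTimeMs) // via purgatory
        else
          requestChannel.sendResponse(response)
      }
    }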
On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers. While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger
On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.

3. Implementation wise, I am wondering if it's simpler to apply the request time throttling first in KafkaApis.handle(). Otherwise, we will need to add the throttling logic in each type of request.

Thanks,

Jun
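A small worked example of the concern in comment 1, with made-up numbers:

    // With a percentage quota, a user's absolute capacity scales with the
    // thread pool; with an absolute unit quota it would stay fixed.
    val quotaPercent  = 10.0   // user's quota: 10% of request handler time
    val threadsBefore = 8      // num.io.threads before the change
    val threadsAfter  = 16     // num.io.threads after the admin doubles it

    // Absolute capacity in "thread-seconds per second" of handler time
    val capacityBefore = quotaPercent / 100 * threadsBefore  // 0.8 thread-units
    val capacityAfter  = quotaPercent / 100 * threadsAfter   // 1.6 thread-units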
On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler utilization. At the moment it uses a percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP. Also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and not have to worry about the internal distribution of the work between the two thread pools in Kafka.

Regards,

Rajini
On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is exactly what people have said. I will just expand on that a bit. Consider the following case. The producer sends a produce request with a 10MB message but compressed to 100KB with gzip. The decompression of the message on the broker could take 10-15 seconds, during which time a request handler thread is completely blocked. In this case, neither the byte-in quota nor the request rate quota may be effective in protecting the broker. Consider another case. A consumer group starts with 10 instances and later on switches to 20 instances. The request rate will likely double, but the actual load on the broker may not double since each fetch request only contains half of the partitions. A request rate quota may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much of the server side resources. In this particular KIP, this resource is the capacity of the request handler threads. I agree that it may not be intuitive for the users to determine how to set the right limit. However, this is not completely new and has been done in the container world already. For example, Linux cgroups (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) have the concept of cpu.cfs_quota_us, which specifies the total amount of time in microseconds for which all tasks in a cgroup can run during a one second period. We can potentially model the request handler threads in a similar way. For example, each request handler thread can be 1 request handler unit and the admin can configure a limit on how many units (say 0.01) a client can have.
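For illustration, a minimal sketch of the unit-based accounting described above, by analogy with cpu.cfs_quota_us. The class, the 1-second window, and the method names are assumptions for the sketch:

    // Hypothetical sketch: model each request handler thread as 1 unit, so a
    // quota of 0.01 units allows a client 1% of one thread's time, i.e. 10 ms
    // of handler time per 1-second window, regardless of the pool size.
    case class HandlerQuota(units: Double) {
      val windowMs = 1000L                          // accounting window, like CFS's 1s period
      def budgetMsPerWindow: Double = units * windowMs

      // True if recording `usedMs` more handler time stays within the budget
      def record(usedSoFarMs: Double, usedMs: Double): Boolean =
        usedSoFarMs + usedMs <= budgetMsPerWindow
    }

    // Example: a client with 0.01 units may use 10 ms of handler time per second
    val quota = HandlerQuota(0.01)
    assert(quota.record(usedSoFarMs = 4.0, usedMs = 5.0))   //  9 ms total: within budget
    assert(!quota.record(usedSoFarMs = 8.0, usedMs = 5.0))  // 13 ms total: over budget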
Regarding not throttling the internal broker to broker requests: we could do that. Alternatively, we could just let the admin configure a high limit for the kafka user (it may not be able to do that easily based on clientId though).

Ideally we want to be able to protect the utilization of the network thread pool too. The difficulty is mostly what Rajini said: (1) The mechanism for throttling the requests is through Purgatory and we will have to think through how to integrate that into the network layer. (2) In the network layer, currently we know the user, but not the clientId of the request. So, it's a bit tricky to throttle based on clientId there. Plus, the byteOut quota can already protect the network thread utilization for fetch requests. So, if we can't figure out this part right now, just focusing on the request handling threads for this KIP is still a useful feature.

Thanks,

Jun
On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that protecting the cluster is more important than protecting individual apps. Have retained the exemption for StopReplica/LeaderAndIsr etc; these are throttled only if authorization fails (so they can't be used for DoS attacks in a secure cluster, but inter-broker requests are allowed to complete without delays).

I will wait another day to see if there is any objection to quotas based on request processing time (as opposed to request rate) and if there are no objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request handler threads (that made the calculation easy). I think the suggestion is to include the time spent in the network threads as well since that may be significant. As Jay pointed out, it is more complicated to calculate the total available CPU time and convert to a ratio when there are *m* I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give us what we want, but it can be very expensive on some platforms. As Becket and Guozhang have pointed out, we do have several time measurements already for generating metrics that we could use, though we might want to switch to nanoTime() instead of currentTimeMillis() since some of the values for small requests may be < 1ms. But rather than add up the time spent in the I/O thread and the network thread, wouldn't it be better to convert the time spent on each thread into a separate ratio? UserA has a request quota of 5%. Can we take that to mean that UserA can use 5% of the time on network threads and 5% of the time on I/O threads? If either is exceeded, the response is throttled - it would mean maintaining two sets of metrics for the two durations, but would result in more meaningful ratios. We could define two quota limits (UserA has 5% of request threads and 10% of network threads), but that seems unnecessary and harder to explain to users.
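For illustration, a minimal sketch of the separate-ratio idea above: one quota value checked independently against network-thread time and I/O-thread time over a quota window. The function name, the window parameter and the delay formula are assumptions for the sketch:

    // Hypothetical sketch: UserA's 5% quota (0.05) is applied separately to
    // time spent on network threads and on I/O threads. If either ratio is
    // exceeded, the response is delayed long enough that usage measured over
    // (window + delay) falls back to the quota.
    def throttleTimeMs(quotaFraction: Double,      // e.g. 0.05 for 5%
                       networkThreadNanos: Long,   // time used on network threads
                       ioThreadNanos: Long,        // time used on I/O threads
                       windowNanos: Long): Long = {
      val networkRatio = networkThreadNanos.toDouble / windowNanos
      val ioRatio      = ioThreadNanos.toDouble / windowNanos
      val worst = math.max(networkRatio, ioRatio)  // throttle on whichever is higher
      if (worst <= quotaFraction) 0L
      else {
        val excess = worst / quotaFraction - 1.0
        (excess * windowNanos / 1000000).toLong    // convert ns -> ms
      }
    }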
Back to why and how quotas are applied to network thread utilization:

a) In the case of fetch, the time spent in the network thread may be significant and I can see the need to include this. Are there other requests where the network thread utilization is significant? In the case of fetch, request handler thread utilization would throttle clients with a high request rate and low data volume, while the fetch byte rate quota will throttle clients with high data volume. Network thread utilization is perhaps proportional to the data volume. I am wondering if we even need to throttle based on network thread utilization or whether the data volume quota covers this case.

b) At the moment, we record and check for quota violation
> > > > > > > > > > > > > >> > >> > >> > > > > at
> > > > > > > > > > > > > >> > >> > >> > > > > > > the
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > same
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > time.
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > If a quota is
> > > > > violated,
> > > > > > > the
> > > > > > > > > > > response
> > > > > > > > > > > > > is
> > > > > > > > > > > > > >> > >> delayed.
> > > > > > > > > > > > > >> > >> > >> > Using
> > > > > > > > > > > > > >> > >> > >> > > > > Jay'e
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > example
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > of
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > disk reads for
> > > > fetches
> > > > > > > > > happening
> > > > > > > > > > > in
> > > > > > > > > > > > > the
> > > > > > > > > > > > > >> > >> network
> > > > > > > > > > > > > >> > >> > >> > thread,
> > > > > > > > > > > > > >> > >> > >> > > > We
> > > > > > > > > > > > > >> > >> > >> > > > > > > can't
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > record
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > and
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > delay a
> response
> > > > after
> > > > > > the
> > > > > > > > > disk
> > > > > > > > > > > > reads.
> > > > > > > > > > > > > >> We
> > > > > > > > > > > > > >> > >> could
> > > > > > > > > > > > > >> > >> > >> > record
> > > > > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > > > > >> > >> > >> > > > > > time
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > spent
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > on
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > the network
> > thread
> > > > > when
> > > > > > > the
> > > > > > > > > > > response
> > > > > > > > > > > > > is
> > > > > > > > > > > > > >> > >> complete
> > > > > > > > > > > > > >> > >> > >> and
> > > > > > > > > > > > > >> > >> > >> > > > > > introduce
> > > > > > > > > > > > > >> > >> > >> > > > > > > a
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > delay
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > for
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > handling a
> > > > subsequent
> > > > > > > > request
> > > > > > > > > > > > > (separate
> > > > > > > > > > > > > >> out
> > > > > > > > > > > > > >> > >> > >> recording
> > > > > > > > > > > > > >> > >> > >> > > and
> > > > > > > > > > > > > >> > >> > >> > > > > > quota
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > violation
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > > handling in
> the
> > > case
> > > > > of
> > > > > > > > > network
> > > > > > > > > > > > thread
> > > > > > > > > > > > > >> > >> > overload).
> > > > > > > > > > > > > >> > >> > >> > Does
> > > > > > > > > > > > > >> > >> > >> > > > that
> > > > > > > > > > > > > >> > >> > >> > > > > > > make
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > sense?
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > > > > > > > >> > >> > >> > > > > > > > > > > > >
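> A minimal sketch of what that separation could look like (all names
> here are made up for illustration, not taken from the implementation):
>
>     // Illustrative only: record network-thread time when the response
>     // completes, and apply any resulting delay to the client's *next*
>     // request instead of the response that has already been sent.
>     interface QuotaSensor { // hypothetical stand-in for the quota metrics
>         long recordAndGetThrottleTimeMs(long observedTimeMs);
>     }
>
>     class DeferredThrottle {
>         private long pendingDelayMs = 0;
>
>         // called from the network thread once a response is complete
>         void onResponseComplete(long networkThreadTimeMs, QuotaSensor sensor) {
>             pendingDelayMs = sensor.recordAndGetThrottleTimeMs(networkThreadTimeMs);
>         }
>
>         // consulted before the same client's next request is processed
>         long delayForNextRequestMs() {
>             long delay = pendingDelayMs;
>             pendingDelayMs = 0;
>             return delay;
>         }
>     }
>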
> Regards,
>
> Rajini
>
> On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:
> >
> > Hey Jay,
> >
> > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > thinking that maybe we can use the existing request statistics. They
> > are already very detailed, so we can probably see the approximate CPU
> > time from it, e.g. something like
> > (total_time - request/response_queue_time - remote_time).
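> >
> > To illustrate that arithmetic (the parameter names below are
> > hypothetical, mirroring the request log breakdown rather than actual
> > metric names):
> >
> >     // Sketch: approximate the CPU time of a request by subtracting
> >     // the time spent waiting in queues or on remote replicas from the
> >     // total time, leaving roughly the time spent on broker threads.
> >     static long approxCpuTimeMs(long totalTimeMs, long requestQueueTimeMs,
> >                                 long responseQueueTimeMs, long remoteTimeMs) {
> >         return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs
> >                 - remoteTimeMs;
> >     }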
> >
> > I agree with Guozhang that when a user is throttled it is likely that
> > we need to see if anything has gone wrong first, and if the users are
> > well behaved and just need more resources, we will have to bump up the
> > quota for them. It is true that pre-allocating CPU time quota
> > precisely for the users is difficult. So in practice it would probably
> > be more like first setting a relatively high protective CPU time quota
> > for everyone and increasing it for some individual clients on demand.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > >
> > > This is a great proposal, glad to see it happening.
> > >
> > > I am inclined to the CPU throttling, or more specifically the
> > > processing time ratio, instead of the request rate throttling as
> > > well. Becket has summed up my rationales very well above, and one
> > > thing to add here is that the former has good support both for
> > > "protecting against rogue clients" and for "utilizing a cluster for
> > > multi-tenancy usage": when thinking about how to explain this to the
> > > end users, I find it actually more natural than the request rate
> > > since, as mentioned above, different requests will have quite
> > > different "cost", and Kafka today already has various request types
> > > (produce, fetch, admin, metadata, etc). Because of that, request
> > > rate throttling may not be as effective unless it is set very
> > > conservatively.
> > >
> > > Regarding user reactions when they are throttled, I think it may
> > > differ case-by-case, and needs to be discovered / guided by looking
> > > at related metrics. So in other words users would not expect to get
> > > additional information by simply being told "hey, you are
> > > throttled", which is all that throttling does; they need to take a
> > > follow-up step and see "hmm, I'm throttled probably because of ..",
> > > which is done by looking at other metric values: e.g. whether I'm
> > > bombarding the brokers with ...
> >
> > [Message clipped]
>
> --
> *Todd Palino*
> Staff Site Reliability Engineer
> Data Infrastructure Streaming
>
> linkedin.com/in/toddpalino

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
I have updated the KIP to use "request.percentage" quotas where the
percentage is out of a total of (num.io.threads * 100). I have added the
other options considered so far under "Rejected Alternatives".

To address Todd's concern about per-thread quotas: even though the quotas
are out of (num.io.threads * 100), clients are not locked into threads.
Utilization is measured as the total across all the I/O threads, so a 10%
quota can be used as 1% on each of 10 threads. Individual quotas can also
be greater than 100% if required.
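
To make the arithmetic concrete, here is a minimal sketch (the class and
field names are illustrative, not taken from the implementation) of how
utilization could be summed across threads and compared to a quota:

    // Illustrative sketch of aggregate thread-time quota accounting.
    // A quota of 10 (percent, out of num.io.threads * 100) can be used
    // as 10% of one thread or as 1% on each of ten threads.
    public class ThreadTimeQuota {
        private final double quotaPercent; // e.g. 10.0 for a 10% quota

        public ThreadTimeQuota(double quotaPercent) {
            this.quotaPercent = quotaPercent;
        }

        // perThreadBusyPercent[i] = % of the window spent on thread i
        public boolean exceedsQuota(double[] perThreadBusyPercent) {
            double total = 0.0;
            for (double p : perThreadBusyPercent) {
                total += p; // summed across threads, out of numThreads * 100
            }
            return total > quotaPercent;
        }
    }

Under this accounting, ten threads at 1% each sit exactly at a quota of 10,
and a quota above 100 simply allows more than one thread's worth of time.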

Please let me know if there are any other concerns or suggestions.

Thank you,

Rajini

On Wed, Mar 8, 2017 at 10:20 PM, Todd Palino <tp...@gmail.com> wrote:

> Rajini -
>
> I understand what you’re saying, but the point I’m making is that I don’t
> believe we need to take it into account directly. The CPU utilization of
> the network threads is directly proportional to the number of bytes being
> sent. The more bytes, the more CPU that is required for SSL (or other
> tasks). This is opposed to the request handler threads, where there are a
> number of factors that affect CPU utilization. This means that it’s not
> necessary to separately quota network thread byte usage and CPU - if we
> quota byte usage (which we already do), we have fixed the CPU usage at a
> proportional amount.
>
> Jun -
>
> Thanks for the clarification there. I was thinking of the utilization
> percentage as being fixed, not what the percentage reflects. I’m not tied
> to either way of doing it, provided that we do not lock clients to a single
> thread. For example, if I specify that a given client can use 10% of a
> single thread, that should also mean they can use 1% on 10 threads.
>
> -Todd
>
>
>
> On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Todd,
> >
> > Thanks for the feedback.
> >
> > I just want to clarify your second point. If the limit percentage is per
> > thread and the thread counts are changed, the absolute processing limit
> > for existing users hasn't changed and there is no need to adjust it. On
> > the other hand, if the limit percentage is of total thread pool capacity
> > and the thread counts are changed, the effective processing limit for a
> > user will change. So, to preserve the current processing limit, existing
> > user limits have to be adjusted. If there is a hardware change, the
> > effective processing limit for a user will change in either approach and
> > the existing limit may need to be adjusted. However, hardware changes
> > are less common than thread pool configuration changes.
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tp...@gmail.com> wrote:
> >
> > > I've been following this one on and off, and overall it sounds good
> > > to me.
> > >
> > > - The SSL question is a good one. However, that type of overhead
> > > should be proportional to the byte rate, so I think that a byte rate
> > > quota would still be a suitable way to address it.
> > >
> > > - I think it's better to make the quota a percentage of total thread
> > > pool capacity, and not a percentage of an individual thread. That way
> > > you don't have to adjust it when you adjust thread counts (tuning,
> > > hardware changes, etc.)
> > >
> > > -Todd
> > >
> > >
> > >
> > > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <be...@gmail.com> wrote:
> > >
> > > > I see. Good point about SSL.
> > > >
> > > > I just asked Todd to take a look.
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Jiangjie,
> > > > >
> > > > > Yes, I agree that byte rate already protects the network threads
> > > > > indirectly. I am not sure if byte rate fully captures the CPU
> > > > > overhead in the network due to SSL. So, at the high level, we can
> > > > > use the request time limit to protect CPU and use byte rate to
> > > > > protect storage and network.
> > > > >
> > > > > Also, do you think you can get Todd to comment on this KIP?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <be...@gmail.com> wrote:
> > > > >
> > > > > > Hi Rajini/Jun,
> > > > > >
> > > > > > The percentage based reasoning sounds good.
> > > > > > One thing I am wondering is that if we assume the network threads
> > > > > > are just doing the network IO, can we say the byte rate quota is
> > > > > > already sort of a network threads quota?
> > > > > > If we take network threads into consideration here, would that be
> > > > > > somewhat overlapping with the byte rate quota?
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <
> > > > > > rajinisivaram@gmail.com> wrote:
> > > > > >
> > > > > > > Jun,
> > > > > > >
> > > > > > > Thank you for the explanation, I hadn't realized you meant
> > > > > > > percentage of the total thread pool. If everyone is OK with
> > > > > > > Jun's suggestion, I will update the KIP.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > > > >
> > > > > > > > Hi, Rajini,
> > > > > > > >
> > > > > > > > Let's take your example. Let's say a user sets the limit to
> > > > > > > > 50%. I am not sure if it's better to apply the same
> > > > > > > > percentage separately to the network and io thread pools.
> > > > > > > > For example, for produce requests, most of the time will be
> > > > > > > > spent in the io threads whereas for fetch requests, most of
> > > > > > > > the time will be in the network threads. So, using the same
> > > > > > > > percentage in both thread pools means one of the pools'
> > > > > > > > resources will be over-allocated.
> > > > > > > >
> > > > > > > > An alternative way is to simply model the network and io
> > > > > > > > thread pools together. If you get 10 io threads and 5
> > > > > > > > network threads, you get 1500% request processing power. A
> > > > > > > > 50% limit means a total of 750% processing power. We just
> > > > > > > > add up the time a user request spent in either the network
> > > > > > > > or the io thread. If that total exceeds 750% (it doesn't
> > > > > > > > matter whether more was spent in the network or the io
> > > > > > > > thread), the request will be throttled. This seems more
> > > > > > > > general and is not sensitive to the current implementation
> > > > > > > > detail of having separate network and io thread pools. In
> > > > > > > > the future, if the threading model changes, the same concept
> > > > > > > > of quota can still be applied. For now, since it's a bit
> > > > > > > > tricky to add the delay logic in the network thread pool, we
> > > > > > > > could probably just do the delaying only in the io threads
> > > > > > > > as you suggested earlier.
> > > > > > > >
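> > > > > > > > As a rough sketch of that bookkeeping (the names and
> > > > > > > > structure are illustrative, not from the KIP):
> > > > > > > >
> > > > > > > >     // Sketch: model io + network threads as one capacity
> > > > > > > >     // pool. With 10 io + 5 network threads the capacity is
> > > > > > > >     // 1500 "percent units"; a 50% limit allows 750 units.
> > > > > > > >     static boolean shouldThrottle(double ioPercentUsed,
> > > > > > > >                                   double networkPercentUsed,
> > > > > > > >                                   int numIoThreads,
> > > > > > > >                                   int numNetworkThreads,
> > > > > > > >                                   double limitFraction) {
> > > > > > > >         double capacity = (numIoThreads + numNetworkThreads) * 100.0;
> > > > > > > >         double allowed = limitFraction * capacity; // 0.5 * 1500 = 750
> > > > > > > >         return ioPercentUsed + networkPercentUsed > allowed;
> > > > > > > >     }
> > > > > > > >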
> > > > > > > > There is still the orthogonal question of whether a quota
> > > > > > > > of 50% is out of 100% or out of 100% * #total processing
> > > > > > > > threads. My feeling is that the latter is slightly better
> > > > > > > > based on my explanation earlier. The way to describe this
> > > > > > > > quota to the users can be "share of elapsed request
> > > > > > > > processing time on a single CPU" (similar to top).
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <
> > > > > > > > rajinisivaram@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > Jun,
> > > > > > > > >
> > > > > > > > > Agree about the two scenarios.
> > > > > > > > >
> > > > > > > > > But I am still not sure about a single quota covering both
> > > > > > > > > network threads and I/O threads with a per-thread quota.
> > > > > > > > > If there are 10 I/O threads and 5 network threads and I
> > > > > > > > > want to assign half the quota to userA, the quota would be
> > > > > > > > > 750%. I imagine, internally, we would convert this to 500%
> > > > > > > > > for I/O and 250% for network threads to allocate 50% of
> > > > > > > > > each pool.
> > > > > > > > >
> > > > > > > > > A couple of scenarios:
> > > > > > > > >
> > > > > > > > > 1. Admin adds 1 extra network thread. To retain 50%, admin
> > > > > > > > > needs to now allocate 800% for each user, or increase the
> > > > > > > > > quota for a few users. To me, it feels like the admin
> > > > > > > > > needs to convert 50% to 800% and Kafka internally needs to
> > > > > > > > > convert 800% to (500%, 300%). Everyone using just 50%
> > > > > > > > > feels a lot simpler.
> > > > > > > > >
> > > > > > > > > 2. We decide to add some other thread to this list. Admin
> > > > > > > > > needs to know exactly how many threads form the maximum
> > > > > > > > > quota. And we can be changing this between broker versions
> > > > > > > > > as we add more to the list. Again a single overall percent
> > > > > > > > > would be a lot simpler.
> > > > > > > > >
> > > > > > > > > There were others who were unconvinced by a single percent
> > > > > > > > > in the initial proposal and were happier with thread units
> > > > > > > > > similar to CPU units, so I am ok with going with
> > > > > > > > > per-thread quotas (as units or percent). Just not sure it
> > > > > > > > > makes it easier for the admin in all cases.
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > Rajini
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io> wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Rajini,
> > > > > > > > > >
> > > > > > > > > > Consider modeling as n * 100% units. For 2), the
> > > > > > > > > > question is what's causing the I/O threads to be
> > > > > > > > > > saturated. It's unlikely that all users' utilization has
> > > > > > > > > > increased at the same time. A more likely case is that a
> > > > > > > > > > few isolated users' utilization has increased. If so,
> > > > > > > > > > after increasing the number of threads, the admin just
> > > > > > > > > > needs to adjust the quota for a few isolated users,
> > > > > > > > > > which is expected and is less work.
> > > > > > > > > >
> > > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all users'
> > > > > > > > > > quotas need to be adjusted, which is unexpected and is
> > > > > > > > > > more work.
> > > > > > > > > >
> > > > > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > > > > >
> > > > > > > > > > As for the future extension to cover network thread
> > > > > > > > > > utilization, I was thinking that one way is to simply
> > > > > > > > > > model the capacity as (n + m) * 100% units, where n and
> > > > > > > > > > m are the number of network and i/o threads,
> > > > > > > > > > respectively. Then, for each user, we can just add up
> > > > > > > > > > the utilization in the network and the i/o threads. If
> > > > > > > > > > we do this, we don't need a new type of quota.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <
> > > > > > > > > > rajinisivaram@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Jun,
> > > > > > > > > > >
> > > > > > > > > > > If we use request.percentage as the percentage used in
> > > > > > > > > > > a single I/O thread, the total percentage being
> > > > > > > > > > > allocated will be num.io.threads * 100 for I/O threads
> > > > > > > > > > > and num.network.threads * 100 for network threads. A
> > > > > > > > > > > single quota covering the two as a percentage wouldn't
> > > > > > > > > > > quite work if you want to allocate the same proportion
> > > > > > > > > > > in both cases. If we want to treat threads as separate
> > > > > > > > > > > units, won't we need two quota configurations
> > > > > > > > > > > regardless of whether we use units or percentage?
> > > > > > > > > > > Perhaps I misunderstood your suggestion.
> > > > > > > > > > >
> > > > > > > > > > > I think there are two cases:
> > > > > > > > > > >
> > > > > > > > > > >    1. The use case that you mentioned where an admin
> > > > > > > > > > >    is adding more users and decides to add more I/O
> > > > > > > > > > >    threads and expects to find free quota to allocate
> > > > > > > > > > >    for new users.
> > > > > > > > > > >    2. Admin adds more I/O threads because the I/O
> > > > > > > > > > >    threads are saturated and there are cores available
> > > > > > > > > > >    to allocate, even though the number of users/clients
> > > > > > > > > > >    hasn't changed.
> > > > > > > > > > >
> > > > > > > > > > > If we treated I/O threads as a single unit of 100%,
> > > > > > > > > > > all user quotas need to be reallocated for 1). If we
> > > > > > > > > > > allocated I/O threads as n units with n*100%, all user
> > > > > > > > > > > quotas need to be reallocated for 2), otherwise some
> > > > > > > > > > > of the new threads may just not be used. Either way it
> > > > > > > > > > > should be easy to write a script to decrease/increase
> > > > > > > > > > > quotas by a multiple for all users.
> > > > > > > > > > >
> > > > > > > > > > > So it really boils down to which quota unit is most
> > > > > > > > > > > intuitive in terms of configuration. And from the
> > > > > > > > > > > discussion so far, it feels like opinion is divided on
> > > > > > > > > > > whether quotas should be carved out of an absolute
> > > > > > > > > > > 100% (or 1 unit) or be relative to the number of
> > > > > > > > > > > threads (n*100% or n units).
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Another way to express an absolute limit is to use
> > > > > > > > > > > > request.percentage, but treat it as the percentage
> > > > > > > > > > > > used in a single request handling thread. For now,
> > > > > > > > > > > > the request handling threads can be just the io
> > > > > > > > > > > > threads. In the future, they can cover the network
> > > > > > > > > > > > threads as well. This is similar to how top reports
> > > > > > > > > > > > CPU usage and may be a bit easier for people to
> > > > > > > > > > > > understand.
> > > > > > > > > > > >
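> > > > > > > > > > > > (As a made-up illustration of the top-style reading:
> > > > > > > > > > > > a user quota of 150 would mean one and a half
> > > > > > > > > > > > request handling threads' worth of time, just as a
> > > > > > > > > > > > process shown at 150% in top is using one and a half
> > > > > > > > > > > > CPUs.)
> > > > > > > > > > > >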
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi, Jay,
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. Regarding request.unit vs request.percentage. I
> > > > > > > > > > > > > started with request.percentage too. The reasoning
> > > > > > > > > > > > > for request.unit is the following. Suppose that
> > > > > > > > > > > > > the capacity has been reached on a broker and the
> > > > > > > > > > > > > admin needs to add a new user. A simple way to
> > > > > > > > > > > > > increase the capacity is to increase the number of
> > > > > > > > > > > > > io threads, assuming there are still enough cores.
> > > > > > > > > > > > > If the limit is based on percentage, the
> > > > > > > > > > > > > additional capacity automatically gets distributed
> > > > > > > > > > > > > to existing users and we haven't really carved out
> > > > > > > > > > > > > any additional resource for the new user. Now, is
> > > > > > > > > > > > > it easy for a user to reason about 0.1 unit vs
> > > > > > > > > > > > > 10%? My feeling is that both are hard and have to
> > > > > > > > > > > > > be configured empirically. Not sure if percentage
> > > > > > > > > > > > > is obviously easier to reason about.
> > > > > > > > > > > > >
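> > > > > > > > > > > > > (To make that concrete with made-up numbers: with
> > > > > > > > > > > > > 8 io threads, a 10% quota is 0.8 threads' worth of
> > > > > > > > > > > > > time; grow the pool to 10 threads and the same 10%
> > > > > > > > > > > > > silently becomes a full thread's worth, whereas an
> > > > > > > > > > > > > absolute 0.8 units stays at 0.8 threads and the
> > > > > > > > > > > > > added capacity remains free to hand to the new
> > > > > > > > > > > > > user.)
> > > > > > > > > > > > >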
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jun
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <
> > > > > > > > > > > > > jay@confluent.io> wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > A couple of quick points:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 1. Even though the implementation of this quota
> > > > > > > > > > > > > > is only using io thread time, I think we should
> > > > > > > > > > > > > > call it something like "request-time". This will
> > > > > > > > > > > > > > give us flexibility to improve the implementation
> > > > > > > > > > > > > > to cover network threads in the future and will
> > > > > > > > > > > > > > avoid exposing internal details like our thread
> > > > > > > > > > > > > > pools on the server.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 2. Jun/Roger, I get what you are trying to fix
> > > > > > > > > > > > > > but the idea of thread/units is super unintuitive
> > > > > > > > > > > > > > as a user-facing knob. I had to read the KIP like
> > > > > > > > > > > > > > eight times to understand this. I'm not sure
> > > > > > > > > > > > > > about your point that increasing the number of
> > > > > > > > > > > > > > threads is a problem with a percentage-based
> > > > > > > > > > > > > > value; it really depends on whether the user
> > > > > > > > > > > > > > thinks about the "percentage of request
> > > > > > > > > > > > > > processing time" or "thread units". If they think
> > > > > > > > > > > > > > "I have allocated 10% of my request processing
> > > > > > > > > > > > > > time to user x" then it is a bug that increasing
> > > > > > > > > > > > > > the thread count decreases that percent as it
> > > > > > > > > > > > > > does in the current proposal. As a practical
> > > > > > > > > > > > > > matter I think the only way to actually reason
> > > > > > > > > > > > > > about this is as a percent---I just don't believe
> > > > > > > > > > > > > > people are going to think, "ah, 4.3 thread units,
> > > > > > > > > > > > > > that is the right amount!". Instead I think they
> > > > > > > > > > > > > > have to understand this thread unit concept,
> > > > > > > > > > > > > > figure out what they have set in number of
> > > > > > > > > > > > > > threads, compute a percent and then come up with
> > > > > > > > > > > > > > the number of thread units, and these will all be
> > > > > > > > > > > > > > wrong if that thread count changes. I also think
> > > > > > > > > > > > > > this ties us to throttling the I/O thread pool,
> > > > > > > > > > > > > > which may not be where we want to end up.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > 3. For what it's worth I do think having a single
> > > > > > > > > > > > > > throttle_ms field in all the responses that
> > > > > > > > > > > > > > combines all throttling from all quotas is
> > > > > > > > > > > > > > probably the simplest. There could be a use case
> > > > > > > > > > > > > > for having separate fields for each, but I think
> > > > > > > > > > > > > > that is actually harder to use/monitor in the
> > > > > > > > > > > > > > common case, so unless someone has a use case I
> > > > > > > > > > > > > > think just one should be fine.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > -Jay
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > > > > > > > > > > > rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > >>
> > > > > > > > > > > > >> > I have updated the KIP based on the discussions
> so
> > > > far.
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Regards,
> > > > > > > > > > > > >> >
> > > > > > > > > > > > >> > Rajini
> > > > > > > > > > > > >> >
> > > > > > > > > > > > > > > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > > > > > > > > > > > > rajinisivaram@gmail.com> wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thank you all for the feedback.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Ismael #1. It makes sense not to throttle
> > > > > > > > > > > > > > > > inter-broker requests like LeaderAndIsr etc.
> > > > > > > > > > > > > > > > The simplest way to ensure that clients
> > > > > > > > > > > > > > > > cannot use these requests to bypass quotas
> > > > > > > > > > > > > > > > for DoS attacks is to ensure that ACLs
> > > > > > > > > > > > > > > > prevent clients from using these requests
> > > > > > > > > > > > > > > > and that unauthorized requests are included
> > > > > > > > > > > > > > > > towards quotas.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Ismael #2, Jay #1: I was thinking that these
> > > > > > > > > > > > > > > > quotas can return a separate throttle time,
> > > > > > > > > > > > > > > > and all utilization based quotas could use
> > > > > > > > > > > > > > > > the same field (we won't add another one for
> > > > > > > > > > > > > > > > network thread utilization, for instance).
> > > > > > > > > > > > > > > > But perhaps it makes sense to keep byte rate
> > > > > > > > > > > > > > > > quotas separate in produce/fetch responses
> > > > > > > > > > > > > > > > to provide separate metrics? Agree with
> > > > > > > > > > > > > > > > Ismael that the name of the existing field
> > > > > > > > > > > > > > > > should be changed if we have two. Happy to
> > > > > > > > > > > > > > > > switch to a single combined throttle time if
> > > > > > > > > > > > > > > > that is sufficient.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Ismael #4, #5, #6: Will update the KIP. Will
> > > > > > > > > > > > > > > > use a dot separated name for the new
> > > > > > > > > > > > > > > > property. Replication quotas use dot
> > > > > > > > > > > > > > > > separated names, so it will be consistent
> > > > > > > > > > > > > > > > with all properties except byte rate quotas.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Radai: #1 Request processing time rather
> > > > > > > > > > > > > > > > than request rate was chosen because the
> > > > > > > > > > > > > > > > time per request can vary significantly
> > > > > > > > > > > > > > > > between requests, as mentioned in the
> > > > > > > > > > > > > > > > discussion and the KIP.
> > > > > > > > > > > > > > > > #2 Two separate quotas for heartbeats/regular
> > > > > > > > > > > > > > > > requests feel like more configuration and
> > > > > > > > > > > > > > > > more metrics. Since most users would set
> > > > > > > > > > > > > > > > quotas higher than the expected usage and
> > > > > > > > > > > > > > > > quotas are more of a safety net, a single
> > > > > > > > > > > > > > > > quota should work in most cases.
> > > > > > > > > > > > > > > > #3 The number of requests in purgatory is
> > > > > > > > > > > > > > > > limited by the number of active connections
> > > > > > > > > > > > > > > > since only one request per connection will
> > > > > > > > > > > > > > > > be throttled at a time.
> > > > > > > > > > > > > > > > #4 As with byte rate quotas, to use the full
> > > > > > > > > > > > > > > > allocated quotas, clients/users would need
> > > > > > > > > > > > > > > > to use partitions that are distributed
> > > > > > > > > > > > > > > > across the cluster. The alternative of using
> > > > > > > > > > > > > > > > cluster-wide quotas instead of per-broker
> > > > > > > > > > > > > > > > quotas would be far too complex to implement.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Dong: We currently have two
> > > > > > > > > > > > > > > > ClientQuotaManagers for the quota types
> > > > > > > > > > > > > > > > Fetch and Produce. A new one will be added
> > > > > > > > > > > > > > > > for IOThread, which manages quotas for I/O
> > > > > > > > > > > > > > > > thread utilization. This will not update the
> > > > > > > > > > > > > > > > Fetch or Produce queue-size, but will have a
> > > > > > > > > > > > > > > > separate metric for the queue-size. I wasn't
> > > > > > > > > > > > > > > > planning to add any additional metrics apart
> > > > > > > > > > > > > > > > from the equivalent ones for existing quotas
> > > > > > > > > > > > > > > > as part of this KIP. The ratio of byte-rate
> > > > > > > > > > > > > > > > to I/O thread utilization could be slightly
> > > > > > > > > > > > > > > > misleading since it depends on the sequence
> > > > > > > > > > > > > > > > of requests. But we can look into more
> > > > > > > > > > > > > > > > metrics after the KIP is implemented if
> > > > > > > > > > > > > > > > required.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think we need to limit the maximum delay
> > > > > > > > > > > > > > > > since all requests are throttled. If a
> > > > > > > > > > > > > > > > client has a quota of 0.001 units and a
> > > > > > > > > > > > > > > > single request used 50ms, we don't want to
> > > > > > > > > > > > > > > > delay all requests from the client by 50
> > > > > > > > > > > > > > > > seconds, throwing the client out of all its
> > > > > > > > > > > > > > > > consumer groups. The issue arises only if a
> > > > > > > > > > > > > > > > user is allocated a quota that is
> > > > > > > > > > > > > > > > insufficient to process one large request.
> > > > > > > > > > > > > > > > The expectation is that the units allocated
> > > > > > > > > > > > > > > > per user will be much higher than the time
> > > > > > > > > > > > > > > > taken to process one request, so the limit
> > > > > > > > > > > > > > > > should seldom be applied. Agree this needs
> > > > > > > > > > > > > > > > proper documentation.
> > > > > > > > > > > > > > > >
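> > > > > > > > > > > > > > > > (To spell out that arithmetic: a 50ms
> > > > > > > > > > > > > > > > request against a quota of 0.001 units
> > > > > > > > > > > > > > > > implies a delay of 50ms / 0.001 = 50,000ms,
> > > > > > > > > > > > > > > > i.e. 50 seconds; capping the delay at the
> > > > > > > > > > > > > > > > quota window keeps such a client responsive
> > > > > > > > > > > > > > > > at the cost of letting it temporarily exceed
> > > > > > > > > > > > > > > > its quota.)
> > > > > > > > > > > > > > > >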
> > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > > > > > > > > > > > > > radai.rosenblatt@gmail.com> wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > @jun: I wasn't concerned about tying up a
> > > > > > > > > > > > > > > > > request processing thread, but IIUC the
> > > > > > > > > > > > > > > > > code does still read the entire request
> > > > > > > > > > > > > > > > > out, which might add up to a
> > > > > > > > > > > > > > > > > non-negligible amount of memory.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > > > > > > > > > > > > > > > > lindong28@gmail.com> wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hey Rajini,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The current KIP says that the maximum
> > > > > > > > > > > > > > > > > > delay will be reduced to the window size
> > > > > > > > > > > > > > > > > > if it is larger than the window size. I
> > > > > > > > > > > > > > > > > > have a concern with this:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 1) This essentially means that the user
> > > > > > > > > > > > > > > > > > is allowed to exceed their quota over a
> > > > > > > > > > > > > > > > > > long period of time. Can you provide an
> > > > > > > > > > > > > > > > > > upper bound on this deviation?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 2) What is the motivation for capping
> > > > > > > > > > > > > > > > > > the maximum delay at the window size? I
> > > > > > > > > > > > > > > > > > am wondering if there is a better
> > > > > > > > > > > > > > > > > > alternative to address the problem.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > 3) It means that the existing
> > > > > > > > > > > > > > > > > > metric-related config will have a more
> > > > > > > > > > > > > > > > > > direct impact on the mechanism of this
> > > > > > > > > > > > > > > > > > io-thread-unit-based quota. This may be
> > > > > > > > > > > > > > > > > > an important change depending on the
> > > > > > > > > > > > > > > > > > answer to 1) above. We probably need to
> > > > > > > > > > > > > > > > > > document this more explicitly.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Dong
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > > > > > > > > > > > > > > > > lindong28@gmail.com> wrote:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > >> > >> > > Hey Jun,
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > > Yeah you are right. I thought it wasn't
> > > because
> > > > > at
> > > > > > > > > LinkedIn
> > > > > > > > > > > it
> > > > > > > > > > > > >> will
> > > > > > > > > > > > >> > be
> > > > > > > > > > > > >> > >> > too
> > > > > > > > > > > > >> > >> > > much pressure on inGraph to expose those
> > > > > > per-clientId
> > > > > > > > > > metrics
> > > > > > > > > > > > so
> > > > > > > > > > > > >> we
> > > > > > > > > > > > >> > >> ended
> > > > > > > > > > > > >> > >> > > up printing them periodically to local
> log.
> > > > Never
> > > > > > > mind
> > > > > > > > if
> > > > > > > > > > it
> > > > > > > > > > > is
> > > > > > > > > > > > >> not
> > > > > > > > > > > > >> > a
> > > > > > > > > > > > >> > >> > > general problem.
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > > Hey Rajini,
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > > - I agree with Jay that we probably don't
> > > want
> > > > to
> > > > > > > add a
> > > > > > > > > new
> > > > > > > > > > > > field
> > > > > > > > > > > > >> > for
> > > > > > > > > > > > >> > >> > > every quota ProduceResponse or
> > FetchResponse.
> > > > Is
> > > > > > > there
> > > > > > > > > any
> > > > > > > > > > > > >> use-case
> > > > > > > > > > > > >> > >> for
> > > > > > > > > > > > >> > >> > > having separate throttle-time fields for
> > > > > > > > byte-rate-quota
> > > > > > > > > > and
> > > > > > > > > > > > >> > >> > > io-thread-unit-quota? You probably need
> to
> > > > > document
> > > > > > > > this
> > > > > > > > > as
> > > > > > > > > > > > >> > interface
> > > > > > > > > > > > >> > >> > > change if you plan to add new field in
> any
> > > > > request.
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > > - I don't think IOThread belongs to
> > > quotaType.
> > > > > The
> > > > > > > > > existing
> > > > > > > > > > > > quota
> > > > > > > > > > > > >> > >> types
> > > > > > > > > > > > >> > >> > > (i.e. Produce/Fetch/LeaderReplicatio
> > > > > > > > > n/FollowerReplication)
> > > > > > > > > > > > >> identify
> > > > > > > > > > > > >> > >> the
> > > > > > > > > > > > >> > >> > > type of request that are throttled, not
> the
> > > > quota
> > > > > > > > > mechanism
> > > > > > > > > > > > that
> > > > > > > > > > > > >> is
> > > > > > > > > > > > >> > >> > applied.
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > > - If a request is throttled due to this
> > > > > > > > > > io-thread-unit-based
> > > > > > > > > > > > >> quota,
> > > > > > > > > > > > >> > is
> > > > > > > > > > > > >> > >> > the
> > > > > > > > > > > > >> > >> > > existing queue-size metric in
> > > > ClientQuotaManager
> > > > > > > > > > incremented?
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > > - In the interest of providing guide line
> > for
> > > > > admin
> > > > > > > to
> > > > > > > > > > decide
> > > > > > > > > > > > >> > >> > > io-thread-unit-based quota and for user
> to
> > > > > > understand
> > > > > > > > its
> > > > > > > > > > > > impact
> > > > > > > > > > > > >> on
> > > > > > > > > > > > >> > >> their
> > > > > > > > > > > > >> > >> > > traffic, would it be useful to have a
> > metric
> > > > that
> > > > > > > shows
> > > > > > > > > the
> > > > > > > > > > > > >> overall
> > > > > > > > > > > > >> > >> > > byte-rate per io-thread-unit? Can we also
> > > show
> > > > > > this a
> > > > > > > > > > > > >> per-clientId
> > > > > > > > > > > > >> > >> > metric?
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > > Thanks,
> > > > > > > > > > > > >> > >> > > Dong
> > > > > > > > > > > > >> > >> > >
> > > > > > > > > > > > >> > >> > >
On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU cores, but it's possible for an admin to start with fewer io threads than cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of returning an error immediately is that the latter has no way to prevent the client from retrying immediately, which will make things worse. The delaying logic is based off a delay queue. A separate expiration thread just waits on the next request to be expired. So, it doesn't tie up a request handler thread.

Thanks,

Jun

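To make the delay-queue mechanism described above concrete, here is a minimal Java sketch. It is not the broker's actual implementation: the ThrottledResponse class, the expirer class and the send callback are hypothetical names chosen for illustration. The one real piece is java.util.concurrent.DelayQueue, whose take() only returns an element once its delay has expired, which is why a single dedicated expiration thread suffices and no request handler thread is ever blocked on a throttled response.

import java.util.concurrent.DelayQueue;
import java.util.concurrent.Delayed;
import java.util.concurrent.TimeUnit;

// A throttled response sits in the queue until its delay expires.
class ThrottledResponse implements Delayed {
    private final long sendAtMs;          // earliest time at which to send
    private final Runnable sendCallback;  // completes the delayed response

    ThrottledResponse(long delayMs, Runnable sendCallback) {
        this.sendAtMs = System.currentTimeMillis() + delayMs;
        this.sendCallback = sendCallback;
    }

    @Override
    public long getDelay(TimeUnit unit) {
        return unit.convert(sendAtMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
    }

    @Override
    public int compareTo(Delayed other) {
        return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
    }

    void send() {
        sendCallback.run();
    }
}

class ThrottledResponseExpirer {
    private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

    // Called by a request handler thread; returns immediately.
    void throttle(long delayMs, Runnable sendCallback) {
        queue.add(new ThrottledResponse(delayMs, sendCallback));
    }

    // One dedicated expiration thread; take() blocks until the earliest
    // delay in the queue has elapsed, then the response is completed.
    void start() {
        Thread t = new Thread(() -> {
            try {
                while (!Thread.currentThread().isInterrupted())
                    queue.take().send();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }, "quota-expiration");
        t.setDaemon(true);
        t.start();
    }
}
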
On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle time field in the response. The downside is that the client metrics will be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and `log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should have the total time your request was throttled, irrespective of the quotas that caused it? Limiting it to the byte rate quota doesn't make sense, but I also don't think we want to end up adding new fields in the response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about io threads. Once we introduce these quotas people set them and expect them to be enforced (and if they aren't it may cause an outage). As a result they are a bit more sensitive than normal configs, I think. The current thread pools seem like something of an implementation detail and not the level the user-facing quotas should be involved with. I think it might be better to make this a general request-time throttle with no mention in the naming about I/O threads, and simply acknowledge the current limitation (which we may someday fix) in the docs that this covers only the time after the request is read off the network.
   3. As such I think the right interface to the user would be something like percent_request_time taking values in {0,...,100}, or request_time_ratio in {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale is between 0 and 1 in the other metrics, right?)

-Jay

On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are going to be very similar to the existing metrics and sensors. To avoid confusion, I have now added more detail. All metrics are in the group "quotaType" and all sensors have names starting with "quotaType" (where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*). So there will be no reuse of existing metrics/sensors. The new ones for request processing time based throttling will be completely independent of existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be impacted by this KIP. That will continue to return byte-rate based throttling times. In addition, a new field request_throttle_time_ms will be added to return request quota based throttling times. These will be exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I believe there are already sufficient metrics to monitor throttling on both the client and broker side for each type of throttling.

Regards,

Rajini

On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to quota a user's traffic here. LGTM overall. I have some questions regarding sensors.

- Can you be more specific in the KIP about which sensors will be added? For example, it will be useful to specify the name and attributes of these new sensors.

- We currently have throttle-time and queue-size for the byte-rate based quota. Are you going to have separate throttle-time and queue-size for requests throttled by the io_thread_unit-based quota, or will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain time due to the io_thread_unit-based quota?

- Currently the kafka server doesn't provide any log or metrics that tell whether any given clientId (or user) is throttled. This is not too bad because we can still check the client-side byte-rate metric to validate whether a given client is throttled. But with this io_thread_unit, there will be no way to validate whether a given client is slow because it has exceeded its io_thread_unit limit. It is necessary for users to be able to know this information to figure out whether they have reached their quota limit. How about we add a log4j log on the server side to periodically print the (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that the kafka administrator can figure out which users have reached their limit and act accordingly?

Thanks,
Dong

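A minimal sketch of the periodic logging Dong suggests, purely as illustration: the ThrottleLogger class, the map it reads and the output format are hypothetical, and a real implementation would pull these values from the broker's quota sensors and write through log4j rather than stdout.

import java.util.Map;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class ThrottleLogger {
    private final ScheduledExecutorService scheduler =
        Executors.newSingleThreadScheduledExecutor();

    // throttleTimes: clientId -> { byte-rate throttle ms, io-thread throttle ms },
    // assumed to be populated elsewhere from the quota sensors.
    void start(Map<String, double[]> throttleTimes) {
        scheduler.scheduleAtFixedRate(() ->
            throttleTimes.forEach((clientId, t) -> {
                if (t[0] > 0 || t[1] > 0)
                    System.out.printf("client=%s byteRateThrottleMs=%.0f ioThreadThrottleMs=%.0f%n",
                        clientId, t[0], t[1]);
            }), 1, 1, TimeUnit.MINUTES);
    }
}

An admin scanning such a line can tell not just that a client was slowed down, but which of the two quota mechanisms was responsible.
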
On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:

Stated as "Request processing time throttling will be applied on top if necessary." I thought that it meant the request processing time throttling is applied first, but continuing to read I found it actually meant to apply produce / fetch byte rate throttling first.

Also the last sentence "The remaining delay if any is applied to the response." is a bit confusing to me. Maybe reword it a bit?

Guozhang

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The property is called *io_thread_units* to align with the thread count property *num.io.threads*. When we implement network thread utilization quotas, we can add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun, did you mean a different request that needs to be added? The four requests currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is easy to exclude them and only throttle if unauthorized. I wasn't sure if there are other requests used only for inter-broker communication that needed to be excluded.

3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling, if any, and then sends the response. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.

Regards,

Rajini

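As a rough illustration of the approach in point 3 above (the real KafkaApis is Scala, and every type and quota-manager method below is a simplified placeholder rather than Kafka's actual API): the throttling decision is made at response-send time, so the time spent inside the request handler is included in what gets recorded against the quota.

interface RequestChannel {
    void sendResponse(Object response);
}

interface QuotaManager {
    // Records usage and returns how long to delay the response; 0 means none.
    long recordAndGetThrottleTimeMs(String clientId, long handlerTimeNanos);
    // Parks the callback in the quota delay queue for delayMs.
    void throttle(long delayMs, Runnable sendCallback);
}

class ApiHandler {
    private final RequestChannel requestChannel;
    private final QuotaManager quotaManager;

    ApiHandler(RequestChannel requestChannel, QuotaManager quotaManager) {
        this.requestChannel = requestChannel;
        this.quotaManager = quotaManager;
    }

    // Used everywhere a handler previously called requestChannel.sendResponse()
    // directly.
    void sendResponseMaybeThrottle(String clientId, long handlerTimeNanos, Object response) {
        long throttleTimeMs = quotaManager.recordAndGetThrottleTimeMs(clientId, handlerTimeNanos);
        if (throttleTimeMs > 0)
            quotaManager.throttle(throttleTimeMs, () -> requestChannel.sendResponse(response));
        else
            requestChannel.sendResponse(response);
    }
}
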
On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers. While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern of request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.

3. Implementation wise, I am wondering if it's simpler to apply the request time throttling first in KafkaApis.handle(). Otherwise, we will need to add the throttling logic in each type of request.

Thanks,

Jun

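A tiny worked example of point 1, with purely illustrative numbers, showing why a percentage quota is not an absolute guarantee:

public class PercentVsAbsolute {
    public static void main(String[] args) {
        double quotaPercent = 10.0; // the user's configured limit
        for (int handlerThreads : new int[] {8, 16}) {
            double absoluteUnits = handlerThreads * quotaPercent / 100.0;
            System.out.printf("%d handler threads: 10%% quota = %.1f thread-units%n",
                handlerThreads, absoluteUnits);
        }
        // Doubling the pool silently doubles the client's absolute capacity
        // (0.8 -> 1.6 thread-units), which is the confusion an absolute
        // "request thread unit" quota avoids.
    }
}
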
On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler utilization. At the moment, it uses percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP. Also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and not have to worry about the internal distribution of the work between the two thread pools in Kafka.

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is exactly what people have said. I will just expand on that a bit. Consider the following case. The producer sends a produce request with a 10MB message, but compressed to 100KB with gzip. The decompression of the message on the broker could take 10-15 seconds, during which time a request handler thread is completely blocked. In this case, neither the byte-in quota nor the request rate quota may be effective in protecting the broker. Consider another case. A consumer group starts with 10 instances and later on switches to 20 instances. The request rate will likely double, but the actual load on the broker may not double since each fetch request only contains half of the partitions. A request rate quota may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much of the server side resources. In this particular KIP, this resource is the capacity of the request handler threads. I agree that it may not be intuitive for the users to determine how to set the right limit. However, this is not completely new and has been done in the container world already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) has the concept of cpu.cfs_quota_us, which specifies the total amount of time in microseconds for which all tasks in a cgroup can run during a one second period. We can potentially model the request handler threads in a similar way. For example, each request handler thread can be 1 request handler unit and the admin can configure a limit on how many units (say 0.01) a client can have.

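A minimal sketch of that absolute request-handler-unit accounting, under stated assumptions: a fixed one-second window in the spirit of cpu.cfs_quota_us/cfs_period_us, and hypothetical class and method names that are not the KIP's implementation.

import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

class HandlerUnitQuota {
    private final double quotaUnits;     // e.g. 0.01 = 1% of one handler thread
    private final long windowMs = 1000;  // accounting period, cgroup-style
    private final Map<String, Long> usedThreadTimeMs = new ConcurrentHashMap<>();

    HandlerUnitQuota(double quotaUnits) {
        this.quotaUnits = quotaUnits;
    }

    // Record the handler thread time a completed request consumed.
    void record(String clientId, long threadTimeMs) {
        usedThreadTimeMs.merge(clientId, threadTimeMs, Long::sum);
    }

    // True once the client has used up its absolute allowance for the window
    // (10 ms of handler time per second at 0.01 units). Unlike a percentage
    // quota, the allowance does not grow when the admin adds handler threads.
    boolean shouldThrottle(String clientId) {
        long allowedMs = (long) (quotaUnits * windowMs);
        return usedThreadTimeMs.getOrDefault(clientId, 0L) > allowedMs;
    }
}

(A real implementation would also reset or decay the per-window usage; that bookkeeping is omitted here.)
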
> > > > Regarding not throttling the internal broker to broker requests.
> > > > We could do that. Alternatively, we could just let the admin
> > > > configure a high limit for the kafka user (it may not be able to do
> > > > that easily based on clientId though).
> > > >
> > > > Ideally we want to be able to protect the utilization of the
> > > > network thread pool too. The difficulty is mostly what Rajini said:
> > > > (1) The mechanism for throttling the requests is through Purgatory
> > > > and we will have to think through how to integrate that into the
> > > > network layer. (2) In the network layer, currently we know the
> > > > user, but not the clientId of the request. So, it's a bit tricky to
> > > > throttle based on clientId there. Plus, the byteOut quota can
> > > > already protect the network thread utilization for fetch requests.
> > > > So, if we can't figure out this part right now, just focusing on
> > > > the request handling threads for this KIP is still a useful
> > > > feature.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram
> > > > <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > Thank you all for the feedback.
> > > > >
> > > > > Jay: I have removed exemption for consumer heartbeat etc. Agree
> > > > > that protecting the cluster is more important than protecting
> > > > > individual apps. Have retained the exemption for
> > > > > StopReplica/LeaderAndIsr etc; these are throttled only if
> > > > > authorization fails (so they can't be used for DoS attacks in a
> > > > > secure cluster, but inter-broker requests complete without
> > > > > delays).
> > > > >
> > > > > I will wait another day to see if there is any objection to
> > > > > quotas based on request processing time (as opposed to request
> > > > > rate) and if there are no objections, I will revert to the
> > > > > original proposal with some changes.
> > > > >
> > > > > The original proposal was only including the time used by the
> > > > > request handler threads (that made calculation easy). I think
> > > > > the suggestion is to include the time spent in the network
> > > > > threads as well since that may be significant. As Jay pointed
> > > > > out, it is more complicated to calculate the total available CPU
> > > > > time and convert to a ratio when there are *m* I/O threads and
> > > > > *n* network threads. ThreadMXBean#getThreadCPUTime() may give us
> > > > > what we want, but it can be very expensive on some platforms. As
> > > > > Becket and Guozhang have pointed out, we do have several time
> > > > > measurements already for generating metrics that we could use,
> > > > > though we might want to switch to nanoTime() instead of
> > > > > currentTimeMillis() since some of the values for small requests
> > > > > may be < 1ms. But rather than add up the time spent in I/O
> > > > > threads and network threads, wouldn't it be better to convert
> > > > > the time spent on each thread into a separate ratio? UserA has a
> > > > > request quota of 5%. Can we take that to mean that UserA can use
> > > > > 5% of the time on network threads and 5% of the time on I/O
> > > > > threads? If either is exceeded, the response is throttled - it
> > > > > would mean maintaining two sets of metrics for the two
> > > > > durations, but would result in more meaningful ratios. We could
> > > > > define two quota limits (UserA has 5% of request threads and 10%
> > > > > of network threads), but that seems unnecessary and harder to
> > > > > explain to users.
> > > > >
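To illustrate the separate-ratio idea (a sketch under assumptions: the
class, the fixed window, and the simple counters are made up for
illustration and are not the actual quota implementation):

    // Sketch: track network-thread time and I/O-thread time as separate
    // percentages of one thread's capacity over a time window, and
    // throttle when either ratio exceeds the single configured quota
    // (e.g. 5%).
    public class DualRatioQuota {
        private final double quotaPercent;  // e.g. 5.0 for a 5% quota
        private final long windowNanos;     // length of the quota window
        private long networkTimeNanos;
        private long ioTimeNanos;

        public DualRatioQuota(double quotaPercent, long windowNanos) {
            this.quotaPercent = quotaPercent;
            this.windowNanos = windowNanos;
        }

        // Record the time one request spent on each kind of thread.
        public synchronized void record(long netNanos, long ioNanos) {
            networkTimeNanos += netNanos;
            ioTimeNanos += ioNanos;
        }

        // Throttle if either per-pool ratio exceeds the quota.
        public synchronized boolean shouldThrottle() {
            double netRatio = 100.0 * networkTimeNanos / windowNanos;
            double ioRatio = 100.0 * ioTimeNanos / windowNanos;
            return netRatio > quotaPercent || ioRatio > quotaPercent;
        }
    }
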
> > > > > Back to why and how quotas are applied to network thread
> > > > > utilization:
> > > > > a) In the case of fetch, the time spent in the network thread
> > > > > may be significant and I can see the need to include this. Are
> > > > > there other requests where the network thread utilization is
> > > > > significant? In the case of fetch, request handler thread
> > > > > utilization would throttle clients with high request rate, low
> > > > > data volume and fetch byte rate quota will throttle clients with
> > > > > high data volume. Network thread utilization is perhaps
> > > > > proportional to the data volume. I am wondering if we even need
> > > > > to throttle based on network thread utilization or whether the
> > > > > data volume quota covers this case.
> > > > >
> > > > > b) At the moment, we record and check for quota violation at
> > > > > the same time. If a quota is violated, the response is delayed.
> > > > > Using Jay's example of disk reads for fetches happening in the
> > > > > network thread, we can't record and delay a response after the
> > > > > disk reads. We could record the time spent on the network thread
> > > > > when the response is complete and introduce a delay for handling
> > > > > a subsequent request (separate out recording and quota violation
> > > > > handling in the case of network thread overload). Does that make
> > > > > sense?
> > > > >
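A minimal sketch of that record-then-delay-later flow (all names are
hypothetical; the actual integration with the network layer would be more
involved):

    // Sketch: a response whose cost is only known after it completes
    // (e.g. disk reads on the network thread) cannot itself be delayed,
    // so the excess time is recorded and charged as a delay against the
    // client's next request.
    public class DeferredThrottle {
        private long pendingDelayMs;

        // Called when a response finishes on the network thread.
        public synchronized void onResponseComplete(long observedMs, long allowedMs) {
            if (observedMs > allowedMs) {
                pendingDelayMs += observedMs - allowedMs;
            }
        }

        // Called before handling the client's next request; returns the
        // delay to apply and resets the carried-over amount.
        public synchronized long takeDelayMs() {
            long delay = pendingDelayMs;
            pendingDelayMs = 0;
            return delay;
        }
    }
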
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin
> > > > > <becket.qin@gmail.com> wrote:
> > > > >
> > > > > > Hey Jay,
> > > > > >
> > > > > > Yeah, I agree that enforcing the CPU time is a little tricky.
> > > > > > I am thinking that maybe we can use the existing request
> > > > > > statistics. They are already very detailed so we can probably
> > > > > > see the approximate CPU time from it, e.g. something like
> > > > > > (total_time - request/response_queue_time - remote_time).
> > > > > >
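For illustration, the kind of derivation being suggested (a sketch; the
field names below are hypothetical placeholders, not the broker's actual
request metric names):

    // Sketch: approximate a request's on-CPU time from the timing
    // breakdown already recorded for request metrics, by subtracting
    // queueing and remote waits from the total time.
    public final class RequestTimes {
        final long totalTimeMs;          // end-to-end time of the request
        final long requestQueueTimeMs;   // waiting in the request queue
        final long responseQueueTimeMs;  // waiting in the response queue
        final long remoteTimeMs;         // waiting on remote work, e.g. replication

        RequestTimes(long total, long reqQueue, long respQueue, long remote) {
            this.totalTimeMs = total;
            this.requestQueueTimeMs = reqQueue;
            this.responseQueueTimeMs = respQueue;
            this.remoteTimeMs = remote;
        }

        // approximate CPU time = total - queueing - remote waits
        long approximateCpuTimeMs() {
            return totalTimeMs - requestQueueTimeMs
                    - responseQueueTimeMs - remoteTimeMs;
        }
    }
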
> > > > > > I agree with Guozhang that when a user is throttled it is
> > > > > > likely that we need to see if anything has gone wrong first,
> > > > > > and if the users are well behaving and just need more
> > > > > > resources, we will have to bump up the quota for them. It is
> > > > > > true that pre-allocating CPU time quota precisely for the
> > > > > > users is difficult. So in practice it would probably be more
> > > > > > like first setting a relatively high protective CPU time quota
> > > > > > for everyone and increasing that for some individual clients
> > > > > > on demand.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang
> > > > > > <wangguoz@gmail.com> wrote:
> > > > > >
> > > > > > > This is a great proposal, glad to see it happening.
> > > > > > >
> > > > > > > I am inclined to the CPU throttling, or more specifically
> > > > > > > processing time ratio, instead of the request rate
> > > > > > > throttling as well. Becket has very well summed my
> > > > > > > rationales above, and one thing to add here is that the
> > > > > > > former has good support for both "protecting against rogue
> > > > > > > clients" as well as "utilizing a cluster for multi-tenancy
> > > > > > > usage": when thinking about how to explain this to the end
> > > > > > > users, I find it actually more natural than the request rate
> > > > > > > since, as mentioned above, different requests will have
> > > > > > > quite different "cost", and Kafka today already has various
> > > > > > > request types (produce, fetch, admin, metadata, etc);
> > > > > > > because of that, the request rate throttling may not be as
> > > > > > > effective unless it is set very conservatively.
> > > > > > >
> > > > > > > Regarding user reactions when they are throttled, I think it
> > > > > > > may differ case-by-case, and needs to be discovered / guided
> > > > > > > by looking at relative metrics. So in other words users
> > > > > > > would not expect to get additional information by simply
> > > > > > > being told "hey, you are throttled", which is all that
> > > > > > > throttling does; they need to take a follow-up step and see
> > > > > > > "hmm, I'm throttled probably because of ..", which is by
> > > > > > > looking at other metric values: e.g. whether I'm bombarding
> > > > > > > the brokers with ...
> > > > > > >
> > > > > > > [Message clipped]
> > >
> > > --
> > > *Todd Palino*
> > > Staff Site Reliability Engineer
> > > Data Infrastructure Streaming
> > >
> > >
> > >
> > > linkedin.com/in/toddpalino
> > >
> >
>
>
>
> --
> *Todd Palino*
> Staff Site Reliability Engineer
> Data Infrastructure Streaming
>
>
>
> linkedin.com/in/toddpalino
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Todd Palino <tp...@gmail.com>.
Rajini -

I understand what you’re saying, but the point I’m making is that I don’t
believe we need to take it into account directly. The CPU utilization of
the network threads is directly proportional to the number of bytes being
sent. The more bytes, the more CPU that is required for SSL (or other
tasks). This is in contrast to the request handler threads, where a number
of factors affect CPU utilization. This means that it’s not
necessary to separately quota network thread byte usage and CPU - if we
quota byte usage (which we already do), we have fixed the CPU usage at a
proportional amount.

Jun -

Thanks for the clarification there. I was thinking of the utilization
percentage as being fixed, not what the percentage reflects. I’m not tied
to either way of doing it, provided that we do not lock clients to a single
thread. For example, if I specify that a given client can use 10% of a
single thread, that should also mean they can use 1% on 10 threads.

-Todd



On Wed, Mar 8, 2017 at 8:57 AM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Todd,
>
> Thanks for the feedback.
>
> I just want to clarify your second point. If the limit percentage is per
> thread and the thread counts are changed, the absolute processing limit for
> existing users hasn't changed and there is no need to adjust them. On the
> other hand, if the limit percentage is of total thread pool capacity and
> the thread counts are changed, the effective processing limit for a user
> will change. So, to preserve the current processing limit, existing user
> limits have to be adjusted. If there is a hardware change, the effective
> processing limit for a user will change in either approach and the existing
> limit may need to be adjusted. However, hardware changes are less common
> than thread pool configuration changes.
>
> Thanks,
>
> Jun
>
> On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tp...@gmail.com> wrote:
>
> > I’ve been following this one on and off, and overall it sounds good to
> me.
> >
> > - The SSL question is a good one. However, that type of overhead should
> > be proportional to the bytes rate, so I think that a bytes rate quota
> > would still be a suitable way to address it.
> >
> > - I think it’s better to make the quota a percentage of total thread
> > pool capacity, and not a percentage of an individual thread. That way
> > you don’t have to adjust it when you adjust thread counts (tuning,
> > hardware changes, etc.)
> >
> >
> > -Todd
> >
> >
> >
> > On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <be...@gmail.com> wrote:
> >
> > > I see. Good point about SSL.
> > >
> > > I just asked Todd to take a look.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Jiangjie,
> > > >
> > > > Yes, I agree that byte rate already protects the network threads
> > > > indirectly. I am not sure if byte rate fully captures the CPU
> > > > overhead in network due to SSL. So, at the high level, we can use
> > > > request time limit to protect CPU and use byte rate to protect
> > > > storage and network.
> > > >
> > > > Also, do you think you can get Todd to comment on this KIP?
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <be...@gmail.com>
> > > > wrote:
> > > >
> > > > > Hi Rajini/Jun,
> > > > >
> > > > > The percentage based reasoning sounds good.
> > > > > One thing I am wondering is that if we assume the network threads
> > > > > are just doing the network IO, can we say the bytes rate quota is
> > > > > already sort of a network threads quota?
> > > > > If we take network threads into consideration here, would that be
> > > > > somewhat overlapping with the bytes rate quota?
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram
> > > > > <rajinisivaram@gmail.com> wrote:
> > > > >
> > > > > > Jun,
> > > > > >
> > > > > > Thank you for the explanation, I hadn't realized you meant
> > > > > > percentage of the total thread pool. If everyone is OK with
> > > > > > Jun's suggestion, I will update the KIP.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io>
> > > > > > wrote:
> > > > > >
> > > > > > > Hi, Rajini,
> > > > > > >
> > > > > > > Let's take your example. Let's say a user sets the limit to
> > > > > > > 50%. I am not sure if it's better to apply the same percentage
> > > > > > > separately to the network and io thread pools. For example,
> > > > > > > for produce requests, most of the time will be spent in the io
> > > > > > > threads whereas for fetch requests, most of the time will be
> > > > > > > in the network threads. So, using the same percentage in both
> > > > > > > thread pools means one of the pools' resources will be over
> > > > > > > allocated.
> > > > > > >
> > > > > > > An alternative way is to simply model the network and io
> > > > > > > thread pools together. If you get 10 io threads and 5 network
> > > > > > > threads, you get 1500% request processing power. A 50% limit
> > > > > > > means a total of 750% processing power. We just add up the
> > > > > > > time a user request spent in either network or io threads. If
> > > > > > > that total exceeds 750% (it doesn't matter whether it's spent
> > > > > > > more in network or io threads), the request will be throttled.
> > > > > > > This seems more general and is not sensitive to the current
> > > > > > > implementation detail of having a separate network and io
> > > > > > > thread pool. In the future, if the threading model changes,
> > > > > > > the same concept of quota can still be applied. For now, since
> > > > > > > it's a bit tricky to add the delay logic in the network thread
> > > > > > > pool, we could probably just do the delaying only in the io
> > > > > > > threads as you suggested earlier.
> > > > > > >
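As a worked version of the arithmetic above (a sketch; the class and
parameter names are hypothetical):

    // Sketch: with n io threads and m network threads, total capacity is
    // (n + m) * 100%. A 50% user limit on 10 io + 5 network threads is
    // therefore 750%; the user's time in either pool is summed over the
    // window and the request is throttled once that share is exceeded.
    public class CombinedPoolQuota {
        private final int ioThreads;         // e.g. 10
        private final int networkThreads;    // e.g. 5
        private final double limitFraction;  // e.g. 0.5 for a 50% limit

        public CombinedPoolQuota(int ioThreads, int networkThreads,
                                 double limitFraction) {
            this.ioThreads = ioThreads;
            this.networkThreads = networkThreads;
            this.limitFraction = limitFraction;
        }

        // usedThreadNanos: the user's summed io + network thread time in
        // the window; windowNanos: the wall-clock length of the window.
        public boolean shouldThrottle(long usedThreadNanos, long windowNanos) {
            double capacityPercent = (ioThreads + networkThreads) * 100.0;  // 1500%
            double allowedPercent = capacityPercent * limitFraction;        // 750%
            double usedPercent = 100.0 * usedThreadNanos / windowNanos;
            return usedPercent > allowedPercent;
        }
    }
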
> > > > > > > There is still the orthogonal question of whether a quota of
> > > > > > > 50% is out of 100% or 100% * #total processing threads. My
> > > > > > > feeling is that the latter is slightly better based on my
> > > > > > > explanation earlier. The way to describe this quota to the
> > > > > > > users can be "share of elapsed request processing time on a
> > > > > > > single CPU" (similar to top).
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram
> > > > > > > <rajinisivaram@gmail.com> wrote:
> > > > > > >
> > > > > > > > Jun,
> > > > > > > >
> > > > > > > > Agree about the two scenarios.
> > > > > > > >
> > > > > > > > But still not sure about a single quota covering both
> > > > > > > > network threads and I/O threads with per-thread quota. If
> > > > > > > > there are 10 I/O threads and 5 network threads and I want to
> > > > > > > > assign half the quota to userA, the quota would be 750%. I
> > > > > > > > imagine, internally, we would convert this to 500% for I/O
> > > > > > > > and 250% for network threads to allocate 50% of each pool.
> > > > > > > >
> > > > > > > > A couple of scenarios:
> > > > > > > >
> > > > > > > > 1. Admin adds 1 extra network thread. To retain 50%, admin
> > > > > > > > needs to now allocate 800% for each user. Or increase the
> > > > > > > > quota for a few users. To me, it feels like admin needs to
> > > > > > > > convert 50% to 800% and Kafka internally needs to convert
> > > > > > > > 800% to (500%, 300%). Everyone using just 50% feels a lot
> > > > > > > > simpler.
> > > > > > > >
> > > > > > > > 2. We decide to add some other thread to this list. Admin
> > > > > > > > needs to know exactly how many threads form the maximum
> > > > > > > > quota. And we can be changing this between broker versions
> > > > > > > > as we add more to the list. Again a single overall percent
> > > > > > > > would be a lot simpler.
> > > > > > > >
> > > > > > > > There were others who were unconvinced by a single percent
> > > > > > > > from the initial proposal and were happier with thread units
> > > > > > > > similar to CPU units, so I am ok with going with per-thread
> > > > > > > > quotas (as units or percent). Just not sure it makes it
> > > > > > > > easier for admin in all cases.
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > >
> > > > > > > > Rajini
> > > > > > > >
> > > > > > > >
> > > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Rajini,
> > > > > > > > >
> > > > > > > > > Consider modeling as n * 100% unit. For 2), the question
> > > > > > > > > is what's causing the I/O threads to be saturated. It's
> > > > > > > > > unlikely that all users' utilization has increased at the
> > > > > > > > > same time. A more likely case is that a few isolated
> > > > > > > > > users' utilization has increased. If so, after increasing
> > > > > > > > > the number of threads, the admin just needs to adjust the
> > > > > > > > > quota for a few isolated users, which is expected and is
> > > > > > > > > less work.
> > > > > > > > >
> > > > > > > > > Consider modeling as 1 * 100% unit. For 1), all users'
> > > > > > > > > quotas need to be adjusted, which is unexpected and is
> > > > > > > > > more work.
> > > > > > > > >
> > > > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > > > >
> > > > > > > > > As for future extension to cover network thread
> > > > > > > > > utilization, I was thinking that one way is to simply
> > > > > > > > > model the capacity as (n + m) * 100% unit, where n and m
> > > > > > > > > are the number of network and i/o threads, respectively.
> > > > > > > > > Then, for each user, we can just add up the utilization in
> > > > > > > > > the network and the i/o threads. If we do this, we don't
> > > > > > > > > need a new type of quota.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram
> > > > > > > > > <rajinisivaram@gmail.com> wrote:
> > > > > > > > >
> > > > > > > > > > Jun,
> > > > > > > > > >
> > > > > > > > > > If we use request.percentage as the percentage used in a
> > > > > > > > > > single I/O thread, the total percentage being allocated
> > > > > > > > > > will be num.io.threads * 100 for I/O threads and
> > > > > > > > > > num.network.threads * 100 for network threads. A single
> > > > > > > > > > quota covering the two as a percentage wouldn't quite
> > > > > > > > > > work if you want to allocate the same proportion in both
> > > > > > > > > > cases. If we want to treat threads as separate units,
> > > > > > > > > > won't we need two quota configurations regardless of
> > > > > > > > > > whether we use units or percentage? Perhaps I
> > > > > > > > > > misunderstood your suggestion.
> > > > > > > > > >
> > > > > > > > > > I think there are two cases:
> > > > > > > > > >
> > > > > > > > > >    1. The use case that you mentioned where an admin is
> > > > > > > > > >    adding more users and decides to add more I/O threads
> > > > > > > > > >    and expects to find free quota to allocate for new
> > > > > > > > > >    users.
> > > > > > > > > >    2. Admin adds more I/O threads because the I/O
> > > > > > > > > >    threads are saturated and there are cores available
> > > > > > > > > >    to allocate, even though the number of users/clients
> > > > > > > > > >    hasn't changed.
> > > > > > > > > >
> > > > > > > > > > If we treated I/O threads as a single unit of 100%, all
> > > > > > > > > > user quotas need to be reallocated for 1). If we
> > > > > > > > > > allocated I/O threads as n units with n*100%, all user
> > > > > > > > > > quotas need to be reallocated for 2), otherwise some of
> > > > > > > > > > the new threads may just not be used. Either way it
> > > > > > > > > > should be easy to write a script to decrease/increase
> > > > > > > > > > quotas by a multiple for all users.
> > > > > > > > > >
> > > > > > > > > > So it really boils down to which quota unit is most
> > > > > > > > > > intuitive in terms of configuration. And from the
> > > > > > > > > > discussion so far, it feels like opinion is divided on
> > > > > > > > > > whether quotas should be carved out of an absolute 100%
> > > > > > > > > > (or 1 unit) or be relative to the number of threads
> > > > > > > > > > (n*100% or n units).
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao
> > > > > > > > > > <jun@confluent.io> wrote:
> > > > > > > > > >
> > > > > > > > > > > Another way to express an absolute limit is to use
> > > > > > > > > > > request.percentage, but treat it as the percentage
> > > > > > > > > > > used in a single request handling thread. For now, the
> > > > > > > > > > > request handling threads can be just the io threads.
> > > > > > > > > > > In the future, they can cover the network threads as
> > > > > > > > > > > well. This is similar to how top reports CPU usage and
> > > > > > > > > > > may be a bit easier for people to understand.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao
> > > > > > > > > > > <jun@confluent.io> wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Jay,
> > > > > > > > > > > >
> > > > > > > > > > > > 2. Regarding request.unit vs request.percentage. I
> > > > > > > > > > > > started with request.percentage too. The reasoning
> > > > > > > > > > > > for request.unit is the following. Suppose that the
> > > > > > > > > > > > capacity has been reached on a broker and the admin
> > > > > > > > > > > > needs to add a new user. A simple way to increase
> > > > > > > > > > > > the capacity is to increase the number of io
> > > > > > > > > > > > threads, assuming there are still enough cores. If
> > > > > > > > > > > > the limit is based on percentage, the additional
> > > > > > > > > > > > capacity automatically gets distributed to existing
> > > > > > > > > > > > users and we haven't really carved out any
> > > > > > > > > > > > additional resource for the new user. Now, is it
> > > > > > > > > > > > easy for a user to reason about 0.1 unit vs 10%? My
> > > > > > > > > > > > feeling is that both are hard and have to be
> > > > > > > > > > > > configured empirically. Not sure if percentage is
> > > > > > > > > > > > obviously easier to reason about.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
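A small sketch contrasting the two schemes when the admin grows the thread pool; the numbers are illustrative:

    public class CapacityGrowth {
        public static void main(String[] args) {
            int threads = 8;
            double unitQuota = 0.8;     // absolute: 0.8 of one io thread
            double percentQuota = 10.0; // relative: 10% of total handler time

            // Absolute thread-time each scheme grants with 8 threads:
            System.out.println(unitQuota);                      // 0.8 threads
            System.out.println(percentQuota * threads / 100.0); // 0.8 threads

            // The admin doubles the pool to make room for a new user:
            threads = 16;
            System.out.println(unitQuota);                      // still 0.8 threads
            System.out.println(percentQuota * threads / 100.0); // now 1.6 threads:
            // the percentage scheme hands the new headroom to existing users.
        }
    }
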
On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:

A couple of quick points:

1. Even though the implementation of this quota is only using io thread time, I think we should call it something like "request-time". This will give us flexibility to improve the implementation to cover network threads in the future and will avoid exposing internal details like our thread pools on the server.

2. Jun/Roger, I get what you are trying to fix, but the idea of thread/units is super unintuitive as a user-facing knob. I had to read the KIP like eight times to understand this. I'm not sure about your point that increasing the number of threads is a problem with a percentage-based value; it really depends on whether the user thinks about the "percentage of request processing time" or "thread units". If they think "I have allocated 10% of my request processing time to user x" then it is a bug that increasing the thread count decreases that percent, as it does in the current proposal. As a practical matter I think the only way to actually reason about this is as a percent -- I just don't believe people are going to think, "ah, 4.3 thread units, that is the right amount!". Instead I think they have to understand this thread unit concept, figure out what they have set in number of threads, compute a percent and then come up with the number of thread units, and these will all be wrong if that thread count changes. I also think this ties us to throttling the I/O thread pool, which may not be where we want to end up.

3. For what it's worth, I do think having a single throttle_ms field in all the responses that combines all throttling from all quotas is probably the simplest. There could be a use case for having separate fields for each, but I think that is actually harder to use/monitor in the common case, so unless someone has a use case I think just one should be fine.

-Jay

On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

I have updated the KIP based on the discussions so far.

Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like LeaderAndIsr etc. The simplest way to ensure that clients cannot use these requests to bypass quotas for DoS attacks is to ensure that ACLs prevent clients from using these requests and that unauthorized requests are included towards quotas.

Ismael #2, Jay #1: I was thinking that these quotas can return a separate throttle time, and all utilization based quotas could use the same field (we won't add another one for network thread utilization, for instance). But perhaps it makes sense to keep byte rate quotas separate in produce/fetch responses to provide separate metrics? Agree with Ismael that the name of the existing field should be changed if we have two. Happy to switch to a single combined throttle time if that is sufficient.

Ismael #4, #5, #6: Will update the KIP. Will use a dot-separated name for the new property. Replication quotas use dot-separated names, so it will be consistent with all properties except the byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen because the time per request can vary significantly between requests, as mentioned in the discussion and the KIP.
#2 Two separate quotas for heartbeats/regular requests feel like more configuration and more metrics. Since most users would set quotas higher than the expected usage and quotas are more of a safety net, a single quota should work in most cases.
#3 The number of requests in purgatory is limited by the number of active connections, since only one request per connection will be throttled at a time.
#4 As with byte rate quotas, to use the full allocated quotas, clients/users would need to use partitions that are distributed across the cluster. The alternative of using cluster-wide quotas instead of per-broker quotas would be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for the quota types Fetch and Produce. A new one will be added for IOThread, which manages quotas for I/O thread utilization. This will not update the Fetch or Produce queue-size, but will have a separate metric for the queue-size. I wasn't planning to add any additional metrics apart from the equivalent ones for existing quotas as part of this KIP. The ratio of byte-rate to I/O thread utilization could be slightly misleading since it depends on the sequence of requests. But we can look into more metrics after the KIP is implemented, if required.

I think we need to limit the maximum delay since all requests are throttled. If a client has a quota of 0.001 units and a single request used 50ms, we don't want to delay all requests from the client by 50 seconds, throwing the client out of all its consumer groups. The issue arises only if a user is allocated a quota that is insufficient to process one large request. The expectation is that the units allocated per user will be much higher than the time taken to process one request, so the limit should seldom be applied. Agree this needs proper documentation.

Regards,

Rajini

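A sketch of the delay arithmetic behind the 0.001-unit example above; the exact formula and the cap are assumptions for illustration, not the KIP's final algorithm:

    public class ThrottleDelay {
        // A request that used `usedMs` of io thread time against a quota of
        // `units` thread-units must be spread over usedMs / units milliseconds
        // of wall-clock time; the delay is capped to avoid pathological waits.
        static long delayMs(double usedMs, double units, long maxDelayMs) {
            long uncapped = Math.round(usedMs / units - usedMs);
            return Math.min(uncapped, maxDelayMs);
        }

        public static void main(String[] args) {
            // Rajini's example: 0.001 units and a single 50 ms request would
            // mean roughly 50 seconds of delay without a cap.
            System.out.println(delayMs(50, 0.001, Long.MAX_VALUE)); // 49950 (~50 s)
            System.out.println(delayMs(50, 0.001, 30_000));         // capped at 30000
        }
    }
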
On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com> wrote:

@jun: I wasn't concerned about tying up a request processing thread, but IIUC the code does still read the entire request out, which might add up to a non-negligible amount of memory.

On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

The current KIP says that the maximum delay will be reduced to the window size if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota over a long period of time. Can you provide an upper bound on this deviation?

2) What is the motivation for capping the maximum delay at the window size? I am wondering if there is a better alternative to address the problem.

3) It means that the existing metric-related config will have a more direct impact on the mechanism of this io-thread-unit-based quota. This may be an important change depending on the answer to 1) above. We probably need to document this more explicitly.

Dong

On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't because at LinkedIn it would be too much pressure on inGraph to expose those per-clientId metrics, so we ended up printing them periodically to a local log. Never mind if it is not a general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field for every quota in ProduceResponse or FetchResponse. Is there any use-case for having separate throttle-time fields for byte-rate-quota and io-thread-unit-quota? You probably need to document this as an interface change if you plan to add a new field in any request.

- I don't think IOThread belongs to quotaType. The existing quota types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the type of request that is throttled, not the quota mechanism that is applied.

- If a request is throttled due to this io-thread-unit-based quota, is the existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the io-thread-unit-based quota, and for users to understand its impact on their traffic, would it be useful to have a metric that shows the overall byte-rate per io-thread-unit? Can we also show this as a per-clientId metric?

Thanks,
Dong

On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically an admin won't configure more io threads than CPU cores, but it's possible for an admin to start with fewer io threads than cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of returning an error immediately is that the latter has no way to prevent the client from retrying immediately, which will make things worse. The delaying logic is based off a delay queue. A separate expiration thread just waits on the next request to expire. So, it doesn't tie up a request handler thread.

Thanks,

Jun

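A minimal sketch of the mechanism Jun describes, using java.util.concurrent.DelayQueue; the class and field names here are illustrative, not the actual broker classes:

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    public class ThrottledResponseQueue {
        static class ThrottledResponse implements Delayed {
            final long sendAtNanos;
            final Runnable send; // the deferred response callback (assumption)

            ThrottledResponse(long delayMs, Runnable send) {
                this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
                this.send = send;
            }

            public long getDelay(TimeUnit unit) {
                return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
            }

            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                        other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void throttle(long delayMs, Runnable send) {
            queue.add(new ThrottledResponse(delayMs, send));
        }

        // One expiration thread blocks on take(), so no request handler
        // thread is tied up while a response waits out its delay.
        void startExpirationThread() {
            Thread t = new Thread(() -> {
                try {
                    while (true)
                        queue.take().send.run();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });
            t.setDaemon(true);
            t.start();
        }
    }
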
On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle time field in the response. The downside is that the client metrics will be more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and `log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

1. Isn't it the case that the throttling time response field should have the total time your request was throttled, irrespective of the quotas that caused it? Limiting it to the byte rate quota doesn't make sense, but I also don't think we want to end up adding new fields in the response for every single thing we quota, right?

2. I don't think we should make this quota specifically about io threads. Once we introduce these quotas, people set them and expect them to be enforced (and if they aren't it may cause an outage). As a result they are a bit more sensitive than normal configs, I think. The current thread pools seem like something of an implementation detail and not the level the user-facing quotas should be involved with. I think it might be better to make this a general request-time throttle with no mention in the naming about I/O threads, and simply acknowledge the current limitation (which we may someday fix) in the docs that this covers only the time after the request is read off the network.

3. As such I think the right interface to the user would be something like percent_request_time in {0,...,100} or request_time_ratio in {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale is between 0 and 1 in the other metrics, right?)

-Jay

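One plausible way to populate such a combined field, assuming the broker reports the longest of the overlapping delays rather than their sum; this is an assumption for illustration, not a decision recorded in this thread:

    public class CombinedThrottle {
        // The delays overlap rather than accumulate, so the single field
        // reports the longest one (hypothetical policy).
        static long combinedThrottleTimeMs(long byteRateDelayMs, long requestTimeDelayMs) {
            return Math.max(byteRateDelayMs, requestTimeDelayMs);
        }

        public static void main(String[] args) {
            System.out.println(combinedThrottleTimeMs(120, 40)); // 120
        }
    }
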
On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are going to be very similar to the existing metrics and sensors. To avoid confusion, I have now added more detail. All metrics are in the group "quotaType" and all sensors have names starting with "quotaType" (where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*). So there will be no reuse of existing metrics/sensors. The new ones for request processing time based throttling will be completely independent of existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be impacted by this KIP. That will continue to return byte-rate based throttling times. In addition, a new field request_throttle_time_ms will be added to return request quota based throttling times. These will be exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I believe there are already sufficient metrics to monitor throttling on both the client and broker side for each type of throttling.

Regards,

Rajini

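A sketch of this naming convention using the client metrics library (org.apache.kafka.common.metrics); the specific sensor and metric names below are illustrative assumptions:

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;
    import org.apache.kafka.common.metrics.stats.Rate;

    public class IoThreadQuotaMetrics {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            String quotaType = "IOThread"; // group name and sensor prefix

            Sensor throttleTime = metrics.sensor(quotaType + "-throttle-time-user1");
            throttleTime.add(metrics.metricName("throttle-time", quotaType,
                    "Average throttle time in ms for user1"), new Avg());

            Sensor requestTime = metrics.sensor(quotaType + "-request-time-user1");
            requestTime.add(metrics.metricName("request-time", quotaType,
                    "Rate of io thread time used by user1"), new Rate());

            throttleTime.record(25.0); // record a 25 ms delay
            metrics.close();
        }
    }
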
On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to quota users' traffic here. LGTM overall. I have some questions regarding sensors.

- Can you be more specific in the KIP about which sensors will be added? For example, it will be useful to specify the name and attributes of these new sensors.

- We currently have throttle-time and queue-size for the byte-rate based quota. Are you going to have a separate throttle-time and queue-size for requests throttled by the io_thread_unit-based quota, or will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain time due to the io_thread_unit-based quota?

- Currently the Kafka server doesn't provide any log or metrics that tell whether any given clientId (or user) is throttled. This is not too bad because we can still check the client-side byte-rate metric to validate whether a given client is throttled. But with this io_thread_unit, there will be no way to validate whether a given client is slow because it has exceeded its io_thread_unit limit. It is necessary for users to be able to know this information to figure out whether they have reached their quota limit. How about we add a log4j log on the server side to periodically print the (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that a Kafka administrator can identify the users that have reached their limit and act accordingly?

Thanks,
Dong

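A minimal sketch of the periodic logging Dong suggests; all names and the source of the throttle times are hypothetical:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    public class ThrottleLogger {
        // Per client id: {byte-rate throttle ms, io-thread-unit throttle ms}
        // (hypothetical bookkeeping, fed by the quota managers).
        static final Map<String, long[]> throttleTimesMs = new ConcurrentHashMap<>();

        public static void main(String[] args) {
            throttleTimesMs.put("clientA", new long[]{120, 30}); // sample data

            ScheduledExecutorService scheduler = Executors.newSingleThreadScheduledExecutor();
            scheduler.scheduleAtFixedRate(() -> {
                for (Map.Entry<String, long[]> e : throttleTimesMs.entrySet())
                    System.out.printf(
                        "client=%s byte-rate-throttle-ms=%d io-thread-throttle-ms=%d%n",
                        e.getKey(), e.getValue()[0], e.getValue()[1]);
            }, 0, 30, TimeUnit.SECONDS);
        }
    }
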
On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:

Stated as "Request processing time throttling will be applied on top if necessary." I thought that it meant the request processing time throttling is applied first, but continuing to read I found it actually meant to apply produce/fetch byte rate throttling first.

Also the last sentence, "The remaining delay if any is applied to the response.", is a bit confusing to me. Maybe reword it a bit?

Guozhang

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The property is called *io_thread_units* to align with the thread count property *num.io.threads*. When we implement network thread utilization quotas, we can add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun, did you mean a different request that needs to be added? The four requests currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is easy to exclude them and throttle only if unauthorized. I wasn't sure if there are other requests used only for inter-broker communication that needed to be excluded.

3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling, if any, plus sends the response. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.

Regards,

Rajini

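A Java-flavoured sketch of this refactoring (the broker code is Scala; apart from sendResponseMaybeThrottle, the interfaces and names below are assumptions):

    public class ResponseThrottling {
        interface RequestChannel { void sendResponse(Object response); }
        interface QuotaManager {
            // Records `timeMs` against the principal's quota and returns how
            // long the response should be delayed (0 if within quota).
            long recordAndGetThrottleTimeMs(String principal, double timeMs);
        }
        interface DelayedResponsePurgatory { void sendAfter(long delayMs, Runnable send); }

        private final RequestChannel channel;
        private final QuotaManager quotas;
        private final DelayedResponsePurgatory purgatory;

        ResponseThrottling(RequestChannel c, QuotaManager q, DelayedResponsePurgatory p) {
            this.channel = c; this.quotas = q; this.purgatory = p;
        }

        // Every call site that used to call requestChannel.sendResponse()
        // directly goes through this method instead, so the time spent in
        // the request handling method is recorded before throttling.
        void sendResponseMaybeThrottle(String principal, double requestTimeMs, Object response) {
            long delayMs = quotas.recordAndGetThrottleTimeMs(principal, requestTimeMs);
            if (delayMs <= 0)
                channel.sendResponse(response); // within quota, send immediately
            else
                purgatory.sendAfter(delayMs, () -> channel.sendResponse(response));
        }
    }
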
On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers. While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.

3. Implementation wise, I am wondering if it's simpler to apply the request time throttling first in KafkaApis.handle(). Otherwise, we will need to add the throttling logic in each type of request.

Thanks,

Jun

On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler utilization. At the moment, it uses percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP. Also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and not have to worry about the internal distribution of the work between the two thread pools in Kafka.

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is
exactly what people have said. I will just expand on that a bit. Consider
the following case. The producer sends a produce request with a 10MB
message compressed to 100KB with gzip. The decompression of the message on
the broker could take 10-15 seconds, during which time a request handler
thread is completely blocked. In this case, neither the byte-in quota nor
the request rate quota may be effective in protecting the broker. Consider
another case. A consumer group starts with 10 instances and later on
switches to 20 instances. The request rate will likely double, but the
actual load on the broker may not double since each fetch request only
contains half of the partitions. A request rate quota may not be easy to
configure in this case.

What we really want is to be able to prevent a client from using too much
of the server-side resources. In this particular KIP, this resource is the
capacity of the request handler threads. I agree that it may not be
intuitive for the users to determine how to set the right limit. However,
this is not completely new and has been done in the container world
already. For example, Linux cgroups
(https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
have the concept of cpu.cfs_quota_us, which specifies the total amount of
time in microseconds for which all tasks in a cgroup can run during a
one-second period. We can potentially model the request handler threads in
a similar way. For example, each request handler thread can be 1 request
handler unit and the admin can configure a limit on how many units (say
0.01) a client can have.

Regarding not throttling the internal broker-to-broker requests: we could
do that. Alternatively, we could just let the admin configure a high limit
for the kafka user (it may not be easy to do that based on clientId
though).

Ideally we want to be able to protect the utilization of the network
thread pool too. The difficulty is mostly what Rajini said: (1) The
mechanism for throttling the requests is through Purgatory and we will
have to think through how to integrate that into the network layer. (2) In
the network layer, we currently know the user, but not the clientId of the
request. So, it's a bit tricky to throttle based on clientId there. Plus,
the byte-out quota can already protect the network thread utilization for
fetch requests. So, if we can't figure out this part right now, just
focusing on the request handler threads for this KIP is still a useful
feature.

Thanks,

Jun
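
To make the cgroup-style "request handler unit" idea concrete, here is a
minimal, hypothetical sketch of per-client accounting over a time window.
The class name, method names and naive window reset are illustrative only,
not Kafka internals:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicLong;

    // Tracks handler-thread time per client in "handler units":
    // 1.0 unit == one request handler thread busy for the whole window.
    public class HandlerTimeQuota {
        private final long windowNanos;
        private final double quotaUnits; // e.g. 0.01 == 1% of one handler thread
        private final Map<String, AtomicLong> usedNanos = new ConcurrentHashMap<>();
        private volatile long windowStartNanos = System.nanoTime();

        public HandlerTimeQuota(long windowNanos, double quotaUnits) {
            this.windowNanos = windowNanos;
            this.quotaUnits = quotaUnits;
        }

        // Called by a request handler thread after it finishes a request.
        public void record(String clientId, long handlerNanos) {
            maybeRollWindow();
            usedNanos.computeIfAbsent(clientId, k -> new AtomicLong())
                     .addAndGet(handlerNanos);
        }

        // True if the client has consumed more handler units than its quota.
        public boolean isThrottled(String clientId) {
            AtomicLong used = usedNanos.get(clientId);
            return used != null && (double) used.get() / windowNanos > quotaUnits;
        }

        private void maybeRollWindow() {
            long now = System.nanoTime();
            if (now - windowStartNanos > windowNanos) {
                usedNanos.clear(); // naive reset; real code would keep sliding samples
                windowStartNanos = now;
            }
        }
    }

Under this accounting, the gzip produce request above is charged its full
10-15 seconds of handler time regardless of its 100KB wire size, which is
exactly the case a byte-in quota misses.
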
On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc; these are
throttled only if authorization fails (so they can't be used for DoS
attacks in a secure cluster, but inter-broker requests complete without
delays).

I will wait another day to see if there is any objection to quotas based
on request processing time (as opposed to request rate) and if there are
no objections, I will revert to the original proposal with some changes.

The original proposal only included the time used by the request handler
threads (that made calculation easy). I think the suggestion is to include
the time spent in the network threads as well since that may be
significant. As Jay pointed out, it is more complicated to calculate the
total available CPU time and convert to a ratio when there are *m* I/O
threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give
us what we want, but it can be very expensive on some platforms. As Becket
and Guozhang have pointed out, we do have several time measurements
already for generating metrics that we could use, though we might want to
switch to nanoTime() instead of currentTimeMillis() since some of the
values for small requests may be < 1ms. But rather than add up the time
spent in the I/O thread and the network thread, wouldn't it be better to
convert the time spent on each thread into a separate ratio? UserA has a
request quota of 5%. Can we take that to mean that UserA can use 5% of the
time on network threads and 5% of the time on I/O threads? If either is
exceeded, the response is throttled - it would mean maintaining two sets
of metrics for the two durations, but would result in more meaningful
ratios. We could define two quota limits (UserA has 5% of request threads
and 10% of network threads), but that seems unnecessary and harder to
explain to users.

Back to why and how quotas are applied to network thread utilization:

a) In the case of fetch, the time spent in the network thread may be
significant and I can see the need to include this. Are there other
requests where the network thread utilization is significant? In the case
of fetch, request handler thread utilization would throttle clients with a
high request rate and low data volume, and the fetch byte rate quota will
throttle clients with a high data volume. Network thread utilization is
perhaps proportional to the data volume. I am wondering if we even need to
throttle based on network thread utilization or whether the data volume
quota covers this case.

b) At the moment, we record and check for quota violation at the same
time. If a quota is violated, the response is delayed. Using Jay's example
of disk reads for fetches happening in the network thread, we can't record
and delay a response after the disk reads. We could record the time spent
on the network thread when the response is complete and introduce a delay
for handling a subsequent request (separating out recording and quota
violation handling in the case of network thread overload). Does that make
sense?

Regards,

Rajini
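
As a rough illustration of the "one percentage, two independent ratios"
idea, the check could keep one utilization figure per thread pool and
throttle when either exceeds the single configured percentage. This is a
sketch under one possible interpretation (each ratio measured against its
pool's total capacity); the names are made up, not broker code:

    // Per-user state: the same limit is applied separately to I/O thread
    // time and network thread time over a quota window.
    public class DualRatioQuota {
        private final double limitPercent; // e.g. 5.0 means 5% of each pool
        private final long windowNanos;
        private long ioNanos = 0;          // this user's time on I/O threads
        private long networkNanos = 0;     // this user's time on network threads

        public DualRatioQuota(double limitPercent, long windowNanos) {
            this.limitPercent = limitPercent;
            this.windowNanos = windowNanos;
        }

        public synchronized void recordIo(long nanos) { ioNanos += nanos; }
        public synchronized void recordNetwork(long nanos) { networkNanos += nanos; }

        // Throttle if EITHER pool's ratio is exceeded.
        public synchronized boolean exceeded(int ioThreads, int networkThreads) {
            double ioPct = 100.0 * ioNanos / ((double) windowNanos * ioThreads);
            double netPct = 100.0 * networkNanos / ((double) windowNanos * networkThreads);
            return ioPct > limitPercent || netPct > limitPercent;
        }
    }
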
On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:

Hey Jay,

Yeah, I agree that enforcing the CPU time is a little tricky. I am
thinking that maybe we can use the existing request statistics. They are
already very detailed, so we can probably see the approximate CPU time
from them, e.g. something like (total_time -
request/response_queue_time - remote_time).

I agree with Guozhang that when a user is throttled, it is likely that we
need to see if anything has gone wrong first, and if the users are well
behaved and just need more resources, we will have to bump up the quota
for them. It is true that pre-allocating CPU time quota precisely for the
users is difficult. So in practice it would probably be more like first
setting a relatively high protective CPU time quota for everyone and
increasing it for some individual clients on demand.

Thanks,

Jiangjie (Becket) Qin
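
For illustration, the approximation Becket describes can be computed from
per-request timing breakdowns of the kind the broker already records; the
field names below are stand-ins, not the actual metric names:

    // Approximate the on-CPU portion of a request from existing timings:
    //   approxCpu = total_time - request/response_queue_time - remote_time
    public final class RequestTimings {
        final long totalNanos;         // from arrival to response send
        final long requestQueueNanos;  // waiting for a request handler thread
        final long responseQueueNanos; // waiting for a network thread
        final long remoteNanos;        // e.g. waiting in purgatory for replicas

        RequestTimings(long total, long requestQueue, long responseQueue, long remote) {
            this.totalNanos = total;
            this.requestQueueNanos = requestQueue;
            this.responseQueueNanos = responseQueue;
            this.remoteNanos = remote;
        }

        // Queue and remote time are spent waiting rather than computing, so
        // subtracting them leaves a rough estimate of processing time.
        long approxCpuNanos() {
            return totalNanos - requestQueueNanos - responseQueueNanos - remoteNanos;
        }
    }
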
On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

This is a great proposal, glad to see it happening.

I am inclined to the CPU throttling, or more specifically processing time
ratio, instead of the request rate throttling as well. Becket has summed
up my rationales very well above, and one thing to add here is that the
former has good support both for "protecting against rogue clients" and
for "utilizing a cluster for multi-tenancy usage": when thinking about how
to explain this to the end users, I find it actually more natural than the
request rate since, as mentioned above, different requests will have quite
different "cost", and Kafka today already has various request types
(produce, fetch, admin, metadata, etc); because of that, the request rate
throttling may not be as effective unless it is set very conservatively.

Regarding user reactions when they are throttled, I think it may differ
case-by-case and needs to be discovered / guided by looking at relative
metrics. So in other words, users would not expect to get additional
information by simply being told "hey, you are throttled", which is all
that throttling does; they need to take a follow-up step and see "hmm, I'm
throttled probably because of ..", which is by looking at other metric
values: e.g. whether I'm bombarding the brokers with

...

[Message clipped]



-- 
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming



linkedin.com/in/toddpalino

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Todd,

Thanks for the feedback.

I just want to clarify your second point. If the limit percentage is per
thread and the thread counts are changed, the absolute processing limits for
existing users haven't changed and there is no need to adjust them. On the
other hand, if the limit percentage is of total thread pool capacity and
the thread counts are changed, the effective processing limit for a user
will change. So, to preserve the current processing limit, existing user
limits have to be adjusted. If there is a hardware change, the effective
processing limit for a user will change in either approach and the existing
limit may need to be adjusted. However, hardware changes are less common
than thread pool configuration changes.

Thanks,

Jun
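
A toy calculation of the distinction Jun draws above (illustrative numbers
only, not broker code):

    // "50%" under per-thread vs. pool-relative semantics.
    public class QuotaSemantics {
        public static void main(String[] args) {
            double limit = 50.0; // the configured percentage

            for (int ioThreads : new int[] {8, 16}) {
                // Per-thread semantics: a fixed absolute share, 50% of one thread.
                double perThread = limit;
                // Pool-relative semantics: the same setting scales with the pool
                // (expressed in single-thread percent, as in top).
                double poolRelative = limit * ioThreads;
                System.out.printf("threads=%d per-thread=%.0f%% pool-relative=%.0f%%%n",
                        ioThreads, perThread, poolRelative);
            }
            // Output: threads=8  per-thread=50% pool-relative=400%
            //         threads=16 per-thread=50% pool-relative=800%
            // Doubling the thread count leaves per-thread limits untouched but
            // doubles every user's effective pool-relative limit.
        }
    }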

On Tue, Mar 7, 2017 at 4:45 PM, Todd Palino <tp...@gmail.com> wrote:

> I’ve been following this one on and off, and overall it sounds good to me.
>
> - The SSL question is a good one. However, that type of overhead should be
> proportional to the bytes rate, so I think that a bytes rate quota would
> still be a suitable way to address it.
>
> - I think it’s better to make the quota percentage of total thread pool
> capacity, and not percentage of an individual thread. That way you don’t
> have to adjust it when you adjust thread counts (tuning, hardware changes,
> etc.)
>
>
> -Todd
>
>
>
> On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <be...@gmail.com> wrote:
>
> > I see. Good point about SSL.
> >
> > I just asked Todd to take a look.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Jiangjie,
> > >
> > > Yes, I agree that byte rate already protects the network threads
> > > indirectly. I am not sure if byte rate fully captures the CPU overhead in
> > > network due to SSL. So, at the high level, we can use request time limit
> > > to protect CPU and use byte rate to protect storage and network.
> > >
> > > Also, do you think you can get Todd to comment on this KIP?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <be...@gmail.com> wrote:
> > >
> > > > Hi Rajini/Jun,
> > > >
> > > > The percentage based reasoning sounds good.
> > > > One thing I am wondering is that if we assume the network threads are just
> > > > doing the network IO, can we say the bytes rate quota is already a sort of
> > > > network thread quota?
> > > > If we take network threads into consideration here, would that be
> > > > somewhat overlapping with the bytes rate quota?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > Thank you for the explanation, I hadn't realized you meant percentage of
> > > > > the total thread pool. If everyone is OK with Jun's suggestion, I will
> > > > > update the KIP.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Let's take your example. Let's say a user sets the limit to 50%. I am not
> > > > > > sure if it's better to apply the same percentage separately to the network
> > > > > > and io thread pools. For example, for produce requests, most of the time
> > > > > > will be spent in the io threads whereas for fetch requests, most of the
> > > > > > time will be in the network threads. So, using the same percentage in both
> > > > > > thread pools means one of the pools' resources will be over-allocated.
> > > > > >
> > > > > > An alternative way is to simply model the network and io thread pools
> > > > > > together. If you get 10 io threads and 5 network threads, you get 1500%
> > > > > > request processing power. A 50% limit means a total of 750% processing
> > > > > > power. We just add up the time a user request spent in either a network or
> > > > > > an io thread. If that total exceeds 750% (it doesn't matter whether it's
> > > > > > spent more in a network or an io thread), the request will be throttled.
> > > > > > This seems more general and is not sensitive to the current implementation
> > > > > > detail of having separate network and io thread pools. In the future, if
> > > > > > the threading model changes, the same concept of quota can still be
> > > > > > applied. For now, since it's a bit tricky to add the delay logic in the
> > > > > > network thread pool, we could probably just do the delaying only in the io
> > > > > > threads as you suggested earlier.
> > > > > >
> > > > > > There is still the orthogonal question of whether a quota of 50% is out of
> > > > > > 100% or 100% * #total processing threads. My feeling is that the latter is
> > > > > > slightly better based on my explanation earlier. The way to describe this
> > > > > > quota to the users can be "share of elapsed request processing time on a
> > > > > > single CPU" (similar to top).
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
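To make the 1500%/750% arithmetic in the message above concrete, here is a
toy sketch of the combined accounting (hypothetical names and numbers):

    // Combined model: capacity = (ioThreads + networkThreads) * 100%, and a
    // user's usage is the summed time spent in either pool during the window.
    public class CombinedPoolQuota {
        public static void main(String[] args) {
            int ioThreads = 10, networkThreads = 5;
            double capacityPct = (ioThreads + networkThreads) * 100.0; // 1500%
            double limitPct = 0.50 * capacityPct;                      // 750%

            long windowMs = 1000;
            long ioMs = 5200, networkMs = 2700; // this user's time in each pool

            // Usage in "percent of one thread busy for the whole window".
            double usedPct = 100.0 * (ioMs + networkMs) / windowMs;    // 790%

            // Only the sum matters, not how it splits across the two pools.
            System.out.printf("limit=%.0f%% used=%.0f%% throttled=%b%n",
                    limitPct, usedPct, usedPct > limitPct);
        }
    }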
> > > > > >
> > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > >
> > > > > > > Jun,
> > > > > > >
> > > > > > > Agree about the two scenarios.
> > > > > > >
> > > > > > > But still not sure about a single quota covering both network threads and
> > > > > > > I/O threads with a per-thread quota. If there are 10 I/O threads and 5
> > > > > > > network threads and I want to assign half the quota to userA, the quota
> > > > > > > would be 750%. I imagine, internally, we would convert this to 500% for
> > > > > > > I/O and 250% for network threads to allocate 50% of each pool.
> > > > > > >
> > > > > > > A couple of scenarios:
> > > > > > >
> > > > > > > 1. Admin adds 1 extra network thread. To retain 50%, admin needs to now
> > > > > > > allocate 800% for each user. Or increase the quota for a few users. To
> > > > > > > me, it feels like admin needs to convert 50% to 800% and Kafka internally
> > > > > > > needs to convert 800% to (500%, 300%). Everyone using just 50% feels a
> > > > > > > lot simpler.
> > > > > > >
> > > > > > > 2. We decide to add some other thread to this list. Admin needs to know
> > > > > > > exactly how many threads form the maximum quota. And we can be changing
> > > > > > > this between broker versions as we add more to the list. Again a single
> > > > > > > overall percent would be a lot simpler.
> > > > > > >
> > > > > > > There were others who were unconvinced by a single percent from the
> > > > > > > initial proposal and were happier with thread units similar to CPU units,
> > > > > > > so I am ok with going with per-thread quotas (as units or percent). Just
> > > > > > > not sure it makes it easier for the admin in all cases.
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io>
> > wrote:
> > > > > > >
> > > > > > > > Hi, Rajini,
> > > > > > > >
> > > > > > > > Consider modeling as n * 100% unit. For 2), the question is
> > > what's
> > > > > > > causing
> > > > > > > > the I/O threads to be saturated. It's unlikely that all
> users'
> > > > > > > utilization
> > > > > > > > has increased at the same time. A more likely case is that a few
> > > > isolated
> > > > > > > > users' utilization has increased. If so, after increasing
> the
> > > > number
> > > > > > of
> > > > > > > > threads, the admin just needs to adjust the quota for a few
> > > > isolated
> > > > > > > users,
> > > > > > > > which is expected and is less work.
> > > > > > > >
> > > > > > > > Consider modeling as 1 * 100% unit. For 1), all users' quota
> > need
> > > > to
> > > > > be
> > > > > > > > adjusted, which is unexpected and is more work.
> > > > > > > >
> > > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > > >
> > > > > > > > As for future extension to cover network thread utilization,
> I
> > > was
> > > > > > > thinking
> > > > > > > > that one way is to simply model the capacity as (n + m) *
> 100%
> > > > unit,
> > > > > > > where
> > > > > > > > n and m are the number of network and i/o threads,
> > respectively.
> > > > > Then,
> > > > > > > for
> > > > > > > > each user, we can just add up the utilization in the network
> > and
> > > > the
> > > > > > i/o
> > > > > > > > thread. If we do this, we don't need a new type of quota.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <
> > > > > > rajinisivaram@gmail.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Jun,
> > > > > > > > >
> > > > > > > > > If we use request.percentage as the percentage used in a
> > single
> > > > I/O
> > > > > > > > thread,
> > > > > > > > > the total percentage being allocated will be
> num.io.threads *
> > > 100
> > > > > for
> > > > > > > I/O
> > > > > > > > > threads and num.network.threads * 100 for network threads.
> A
> > > > single
> > > > > > > quota
> > > > > > > > > covering the two as a percentage wouldn't quite work if you
> > > want
> > > > to
> > > > > > > > > allocate the same proportion in both cases. If we want to
> > treat
> > > > > > threads
> > > > > > > > as
> > > > > > > > > separate units, won't we need two quota configurations
> > > regardless
> > > > > of
> > > > > > > > > whether we use units or percentage? Perhaps I misunderstood
> > > your
> > > > > > > > > suggestion.
> > > > > > > > >
> > > > > > > > > I think there are two cases:
> > > > > > > > >
> > > > > > > > >    1. The use case that you mentioned where an admin is
> > adding
> > > > more
> > > > > > > users
> > > > > > > > >    and decides to add more I/O threads and expects to find
> > free
> > > > > quota
> > > > > > > to
> > > > > > > > >    allocate for new users.
> > > > > > > > >    2. Admin adds more I/O threads because the I/O threads
> are
> > > > > > saturated
> > > > > > > > and
> > > > > > > > >    there are cores available to allocate, even though the
> > > number
> > > > or
> > > > > > > > >    users/clients hasn't changed.
> > > > > > > > >
> > > > > > > > > If we treated I/O threads as a single unit of
> 100%,
> > > all
> > > > > > user
> > > > > > > > > quotas need to be reallocated for 1). If we allocated I/O
> > > threads
> > > > > as
> > > > > > n
> > > > > > > > > units with n*100%, all user quotas need to be reallocated
> for
> > > 2),
> > > > > > > > otherwise
> > > > > > > > > some of the new threads may just not be used. Either way it
> > > > should
> > > > > be
> > > > > > > > easy
> > > > > > > > > to write a script to decrease/increase quotas by a multiple
> > for
> > > > all
> > > > > > > > users.
> > > > > > > > >
> > > > > > > > > So it really boils down to which quota unit is most
> intuitive
> > > in
> > > > > > terms
> > > > > > > of
> > > > > > > > > configuration. And from the discussion so far, it feels
> like
> > > > > opinion
> > > > > > is
> > > > > > > > > divided on whether quotas should be carved out of an
> absolute
> > > > 100%
> > > > > > (or
> > > > > > > 1
> > > > > > > > > unit) or be relative to the number of threads (n*100% or n
> > > > units).
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Another way to express an absolute limit is to use
> > > > > > > request.percentage,
> > > > > > > > > but
> > > > > > > > > > treat it as the percentage used in a single request
> > handling
> > > > > > thread.
> > > > > > > > For
> > > > > > > > > > now, the request handling threads can be just the io
> > threads.
> > > > In
> > > > > > the
> > > > > > > > > > future, they can cover the network threads as well. This
> is
> > > > > similar
> > > > > > > to
> > > > > > > > > how
> > > > > > > > > > top reports CPU usage and may be a bit easier for people
> to
> > > > > > > understand.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <
> > jun@confluent.io>
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Jay,
> > > > > > > > > > >
> > > > > > > > > > > 2. Regarding request.unit vs request.percentage. I
> > started
> > > > with
> > > > > > > > > > > request.percentage too. The reasoning for request.unit
> is
> > > the
> > > > > > > > > following.
> > > > > > > > > > > Suppose that the capacity has been reached on a broker
> > and
> > > > the
> > > > > > > admin
> > > > > > > > > > needs
> > > > > > > > > > > to add a new user. A simple way to increase the
> capacity
> > is
> > > > to
> > > > > > > > increase
> > > > > > > > > > the
> > > > > > > > > > > number of io threads, assuming there are still enough
> > > cores.
> > > > If
> > > > > > the
> > > > > > > > > limit
> > > > > > > > > > > is based on percentage, the additional capacity
> > > automatically
> > > > > > gets
> > > > > > > > > > > distributed to existing users and we haven't really
> > carved
> > > > out
> > > > > > any
> > > > > > > > > > > additional resource for the new user. Now, is it easy
> > for a
> > > > > user
> > > > > > to
> > > > > > > > > > reason
> > > > > > > > > > > about 0.1 unit vs 10%. My feeling is that both are hard
> > and
> > > > > have
> > > > > > to
> > > > > > > > be
> > > > > > > > > > > configured empirically. Not sure if percentage is
> > obviously
> > > > > > easier
> > > > > > > to
> > > > > > > > > > > reason about.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <
> > > jay@confluent.io
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > >> A couple of quick points:
> > > > > > > > > > >>
> > > > > > > > > > >> 1. Even though the implementation of this quota is
> only
> > > > using
> > > > > io
> > > > > > > > > thread
> > > > > > > > > > >> time, i think we should call it something like
> > > > "request-time".
> > > > > > > This
> > > > > > > > > will
> > > > > > > > > > >> give us flexibility to improve the implementation to
> > cover
> > > > > > network
> > > > > > > > > > threads
> > > > > > > > > > >> in the future and will avoid exposing internal details
> > > like
> > > > > our
> > > > > > > > thread
> > > > > > > > > > >> pools on the server.
> > > > > > > > > > >>
> > > > > > > > > > >> 2. Jun/Roger, I get what you are trying to fix but the
> > > idea
> > > > of
> > > > > > > > > > >> thread/units
> > > > > > > > > > >> is super unintuitive as a user-facing knob. I had to
> > read
> > > > the
> > > > > > KIP
> > > > > > > > like
> > > > > > > > > > >> eight times to understand this. I'm not sure that your
> > > point
> > > > > > that
> > > > > > > > > > >> increasing the number of threads is a problem with a
> > > > > > > > percentage-based
> > > > > > > > > > >> value, it really depends on whether the user thinks
> > about
> > > > the
> > > > > > > > > > "percentage
> > > > > > > > > > >> of request processing time" or "thread units". If they
> > > think
> > > > > "I
> > > > > > > have
> > > > > > > > > > >> allocated 10% of my request processing time to user x"
> > > then
> > > > it
> > > > > > is
> > > > > > > a
> > > > > > > > > bug
> > > > > > > > > > >> that increasing the thread count decreases that
> percent
> > as
> > > > it
> > > > > > does
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > >> current proposal. As a practical matter I think the
> only
> > > way
> > > > > to
> > > > > > > > > actually
> > > > > > > > > > >> reason about this is as a percent---I just don't
> believe
> > > > > people
> > > > > > > are
> > > > > > > > > > going
> > > > > > > > > > >> to think, "ah, 4.3 thread units, that is the right
> > > amount!".
> > > > > > > > Instead I
> > > > > > > > > > >> think they have to understand this thread unit
> concept,
> > > > figure
> > > > > > out
> > > > > > > > > what
> > > > > > > > > > >> they have set in number of threads, compute a percent
> > and
> > > > then
> > > > > > > come
> > > > > > > > up
> > > > > > > > > > >> with
> > > > > > > > > > >> the number of thread units, and these will all be
> wrong
> > if
> > > > > that
> > > > > > > > thread
> > > > > > > > > > >> count changes. I also think this ties us to throttling
> > the
> > > > I/O
> > > > > > > > thread
> > > > > > > > > > >> pool,
> > > > > > > > > > >> which may not be where we want to end up.
> > > > > > > > > > >>
> > > > > > > > > > >> 3. For what it's worth I do think having a single
> > > > throttle_ms
> > > > > > > field
> > > > > > > > in
> > > > > > > > > > all
> > > > > > > > > > >> the responses that combines all throttling from all
> > quotas
> > > > is
> > > > > > > > probably
> > > > > > > > > > the
> > > > > > > > > > >> simplest. There could be a use case for having
> separate
> > > > fields
> > > > > > for
> > > > > > > > > each,
> > > > > > > > > > >> but I think that is actually harder to use/monitor in
> > the
> > > > > common
> > > > > > > > case
> > > > > > > > > so
> > > > > > > > > > >> unless someone has a use case I think just one should
> be
> > > > fine.
> > > > > > > > > > >>
> > > > > > > > > > >> -Jay
> > > > > > > > > > >>
> > > > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > > > > > > > rajinisivaram@gmail.com>
> > > > > > > > > > >> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> > I have updated the KIP based on the discussions so
> > far.
> > > > > > > > > > >> >
> > > > > > > > > > >> >
> > > > > > > > > > >> > Regards,
> > > > > > > > > > >> >
> > > > > > > > > > >> > Rajini
> > > > > > > > > > >> >
> > > > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > > > > > > > >> rajinisivaram@gmail.com>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > Thank you all for the feedback.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Ismael #1. It makes sense not to throttle
> > inter-broker
> > > > > > > requests
> > > > > > > > > like
> > > > > > > > > > >> > > LeaderAndIsr etc. The simplest way to ensure that
> > > > clients
> > > > > > > cannot
> > > > > > > > > use
> > > > > > > > > > >> > these
> > > > > > > > > > >> > > requests to bypass quotas for DoS attacks is to
> > ensure
> > > > > that
> > > > > > > ACLs
> > > > > > > > > > >> prevent
> > > > > > > > > > >> > > clients from using these requests and unauthorized
> > > > > requests
> > > > > > > are
> > > > > > > > > > >> included
> > > > > > > > > > >> > > towards quotas.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Ismael #2, Jay #1 : I was thinking that these
> quotas
> > > can
> > > > > > > return
> > > > > > > > a
> > > > > > > > > > >> > separate
> > > > > > > > > > >> > > throttle time, and all utilization based quotas
> > could
> > > > use
> > > > > > the
> > > > > > > > same
> > > > > > > > > > >> field
> > > > > > > > > > >> > > (we won't add another one for network thread
> > > utilization
> > > > > for
> > > > > > > > > > >> instance).
> > > > > > > > > > >> > But
> > > > > > > > > > >> > > perhaps it makes sense to keep byte rate quotas
> > > separate
> > > > > in
> > > > > > > > > > >> produce/fetch
> > > > > > > > > > >> > > responses to provide separate metrics? Agree with
> > > Ismael
> > > > > > that
> > > > > > > > the
> > > > > > > > > > >> name of
> > > > > > > > > > >> > > the existing field should be changed if we have
> two.
> > > > Happy
> > > > > > to
> > > > > > > > > switch
> > > > > > > > > > >> to a
> > > > > > > > > > >> > > single combined throttle time if that is
> sufficient.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will use dot
> > > > separated
> > > > > > > name
> > > > > > > > > for
> > > > > > > > > > >> new
> > > > > > > > > > >> > > property. Replication quotas use dot separated, so
> > it
> > > > will
> > > > > > be
> > > > > > > > > > >> consistent
> > > > > > > > > > >> > > with all properties except byte rate quotas.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Radai: #1 Request processing time rather than
> > request
> > > > rate
> > > > > > > was
> > > > > > > > > > chosen
> > > > > > > > > > >> > > because the time per request can vary
> significantly
> > > > > between
> > > > > > > > > requests
> > > > > > > > > > >> as
> > > > > > > > > > >> > > mentioned in the discussion and KIP.
> > > > > > > > > > >> > > #2 Two separate quotas for heartbeats/regular
> > requests
> > > > > feel
> > > > > > > like
> > > > > > > > > > more
> > > > > > > > > > >> > > configuration and more metrics. Since most users
> > would
> > > > set
> > > > > > > > quotas
> > > > > > > > > > >> higher
> > > > > > > > > > >> > > than the expected usage and quotas are more of a
> > > safety
> > > > > > net, a
> > > > > > > > > > single
> > > > > > > > > > >> > quota
> > > > > > > > > > >> > > should work in most cases.
> > > > > > > > > > >> > >  #3 The number of requests in purgatory is limited
> > by
> > > > the
> > > > > > > number
> > > > > > > > > of
> > > > > > > > > > >> > active
> > > > > > > > > > >> > > connections since only one request per connection
> > will
> > > > be
> > > > > > > > > throttled
> > > > > > > > > > >> at a
> > > > > > > > > > >> > > time.
> > > > > > > > > > >> > > #4 As with byte rate quotas, to use the full
> > allocated
> > > > > > quotas,
> > > > > > > > > > >> > > clients/users would need to use partitions that
> are
> > > > > > > distributed
> > > > > > > > > > across
> > > > > > > > > > >> > the
> > > > > > > > > > >> > > cluster. The alternative of using cluster-wide
> > quotas
> > > > > > instead
> > > > > > > of
> > > > > > > > > > >> > per-broker
> > > > > > > > > > >> > > quotas would be far too complex to implement.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Dong : We currently have two ClientQuotaManagers
> for
> > > > quota
> > > > > > > types
> > > > > > > > > > Fetch
> > > > > > > > > > >> > and
> > > > > > > > > > >> > > Produce. A new one will be added for IOThread,
> which
> > > > > manages
> > > > > > > > > quotas
> > > > > > > > > > >> for
> > > > > > > > > > >> > I/O
> > > > > > > > > > >> > > thread utilization. This will not update the Fetch
> > or
> > > > > > Produce
> > > > > > > > > > >> queue-size,
> > > > > > > > > > >> > > but will have a separate metric for the
> > queue-size.  I
> > > > > > wasn't
> > > > > > > > > > >> planning to
> > > > > > > > > > >> > > add any additional metrics apart from the
> equivalent
> > > > ones
> > > > > > for
> > > > > > > > > > existing
> > > > > > > > > > >> > > quotas as part of this KIP. Ratio of byte-rate to
> > I/O
> > > > > thread
> > > > > > > > > > >> utilization
> > > > > > > > > > >> > > could be slightly misleading since it depends on
> the
> > > > > > sequence
> > > > > > > of
> > > > > > > > > > >> > requests.
> > > > > > > > > > >> > > But we can look into more metrics after the KIP is
> > > > > > implemented
> > > > > > > > if
> > > > > > > > > > >> > required.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > I think we need to limit the maximum delay since
> all
> > > > > > requests
> > > > > > > > are
> > > > > > > > > > >> > > throttled. If a client has a quota of 0.001 units
> > and
> > > a
> > > > > > single
> > > > > > > > > > request
> > > > > > > > > > >> > used
> > > > > > > > > > >> > > 50ms, we don't want to delay all requests from the
> > > > client
> > > > > by
> > > > > > > 50
> > > > > > > > > > >> seconds,
> > > > > > > > > > >> > > throwing the client out of all its consumer
> groups.
> > > The
> > > > > > issue
> > > > > > > is
> > > > > > > > > > only
> > > > > > > > > > >> if
> > > > > > > > > > >> > a
> > > > > > > > > > >> > > user is allocated a quota that is insufficient to
> > > > process
> > > > > > one
> > > > > > > > > large
> > > > > > > > > > >> > > request. The expectation is that the units
> allocated
> > > per
> > > > > > user
> > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > >> > much
> > > > > > > > > > >> > > higher than the time taken to process one request
> > and
> > > > the
> > > > > > > limit
> > > > > > > > > > should
> > > > > > > > > > >> > > seldom be applied. Agree this needs proper
> > > > documentation.
> > > > > > > > > > >> > >
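> > > > > > > > > > >> > > As a minimal sketch of the capping rule (the exact formula in the
> > > > > > > > > > >> > > implementation may differ; the names are illustrative):
> > > > > > > > > > >> > >
> > > > > > > > > > >> > >   // Delay proportional to how far over quota the user is, capped
> > > > > > > > > > >> > >   // so that one expensive request cannot stall a low-quota client
> > > > > > > > > > >> > >   // for minutes (50ms at 0.001 units would otherwise be ~50s).
> > > > > > > > > > >> > >   long throttleTimeMs(double measured, double quota,
> > > > > > > > > > >> > >                       long windowMs, long maxDelayMs) {
> > > > > > > > > > >> > >       if (measured <= quota) return 0;
> > > > > > > > > > >> > >       long delay = (long) (((measured - quota) / quota) * windowMs);
> > > > > > > > > > >> > >       return Math.min(delay, maxDelayMs);
> > > > > > > > > > >> > >   }
> > > > > > > > > > >> > >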
> > > > > > > > > > >> > > Regards,
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Rajini
> > > > > > > > > > >> > >
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > > > > > > radai.rosenblatt@gmail.com>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > >> @jun: i wasnt concerned about tying up a request
> > > > > processing
> > > > > > > > > thread,
> > > > > > > > > > >> but
> > > > > > > > > > >> > >> IIUC the code does still read the entire request
> > out,
> > > > > which
> > > > > > > > might
> > > > > > > > > > >> add-up
> > > > > > > > > > >> > >> to
> > > > > > > > > > >> > >> a non-negligible amount of memory.
> > > > > > > > > > >> > >>
> > > > > > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > > > > > > > lindong28@gmail.com>
> > > > > > > > > > >> wrote:
> > > > > > > > > > >> > >>
> > > > > > > > > > >> > >> > Hey Rajini,
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > The current KIP says that the maximum delay
> will
> > be
> > > > > > reduced
> > > > > > > > to
> > > > > > > > > > >> window
> > > > > > > > > > >> > >> size
> > > > > > > > > > >> > >> > if it is larger than the window size. I have a
> > > > concern
> > > > > > with
> > > > > > > > > this:
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > 1) This essentially means that the user is
> > allowed
> > > to
> > > > > > > exceed
> > > > > > > > > > their
> > > > > > > > > > >> > quota
> > > > > > > > > > >> > >> > over a long period of time. Can you provide an
> > > upper
> > > > > > bound
> > > > > > > on
> > > > > > > > > > this
> > > > > > > > > > >> > >> > deviation?
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > 2) What is the motivation for capping the maximum
> > delay
> > > > by
> > > > > > the
> > > > > > > > > window
> > > > > > > > > > >> > size?
> > > > > > > > > > >> > >> I
> > > > > > > > > > >> > >> > am wondering if there is a better alternative to
> > > > address
> > > > > > the
> > > > > > > > > > problem.
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > 3) It means that the existing metric-related
> > config
> > > > > will
> > > > > > > > have a
> > > > > > > > > > >> more
> > > > > > > > > > >> > >> > direct impact on the mechanism of this
> > > > > > > io-thread-unit-based
> > > > > > > > > > >> quota.
> > > > > > > > > > >> > This
> > > > > > > > > > >> > >> > may be an important change depending on the
> > answer
> > > to
> > > > > 1)
> > > > > > > > above.
> > > > > > > > > > We
> > > > > > > > > > >> > >> probably
> > > > > > > > > > >> > >> > need to document this more explicitly.
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > Dong
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > > > > > > > lindong28@gmail.com>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > > Hey Jun,
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > Yeah you are right. I thought it wasn't
> because
> > > at
> > > > > > > LinkedIn
> > > > > > > > > it
> > > > > > > > > > >> will
> > > > > > > > > > >> > be
> > > > > > > > > > >> > >> > too
> > > > > > > > > > >> > >> > > much pressure on inGraph to expose those
> > > > per-clientId
> > > > > > > > metrics
> > > > > > > > > > so
> > > > > > > > > > >> we
> > > > > > > > > > >> > >> ended
> > > > > > > > > > >> > >> > > up printing them periodically to local log.
> > Never
> > > > > mind
> > > > > > if
> > > > > > > > it
> > > > > > > > > is
> > > > > > > > > > >> not
> > > > > > > > > > >> > a
> > > > > > > > > > >> > >> > > general problem.
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > Hey Rajini,
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - I agree with Jay that we probably don't
> want
> > to
> > > > > add a
> > > > > > > new
> > > > > > > > > > field
> > > > > > > > > > >> > for
> > > > > > > > > > >> > >> > > every quota in ProduceResponse or FetchResponse.
> > Is
> > > > > there
> > > > > > > any
> > > > > > > > > > >> use-case
> > > > > > > > > > >> > >> for
> > > > > > > > > > >> > >> > > having separate throttle-time fields for
> > > > > > byte-rate-quota
> > > > > > > > and
> > > > > > > > > > >> > >> > > io-thread-unit-quota? You probably need to
> > > document
> > > > > > this
> > > > > > > as
> > > > > > > > > > >> > interface
> > > > > > > > > > >> > >> > > change if you plan to add new field in any
> > > request.
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - I don't think IOThread belongs to
> quotaType.
> > > The
> > > > > > > existing
> > > > > > > > > > quota
> > > > > > > > > > >> > >> types
> > > > > > > > > > >> > >> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> > > > > > > > > > >> identify
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > > type of request that are throttled, not the
> > quota
> > > > > > > mechanism
> > > > > > > > > > that
> > > > > > > > > > >> is
> > > > > > > > > > >> > >> > applied.
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - If a request is throttled due to this
> > > > > > > > io-thread-unit-based
> > > > > > > > > > >> quota,
> > > > > > > > > > >> > is
> > > > > > > > > > >> > >> > the
> > > > > > > > > > >> > >> > > existing queue-size metric in
> > ClientQuotaManager
> > > > > > > > incremented?
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - In the interest of providing guide line for
> > > admin
> > > > > to
> > > > > > > > decide
> > > > > > > > > > >> > >> > > io-thread-unit-based quota and for user to
> > > > understand
> > > > > > its
> > > > > > > > > > impact
> > > > > > > > > > >> on
> > > > > > > > > > >> > >> their
> > > > > > > > > > >> > >> > > traffic, would it be useful to have a metric
> > that
> > > > > shows
> > > > > > > the
> > > > > > > > > > >> overall
> > > > > > > > > > >> > >> > > byte-rate per io-thread-unit? Can we also
> show
> > > > this a
> > > > > > > > > > >> per-clientId
> > > > > > > > > > >> > >> > metric?
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > Thanks,
> > > > > > > > > > >> > >> > > Dong
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <
> > > > > > > jun@confluent.io
> > > > > > > > >
> > > > > > > > > > >> wrote:
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > >> Hi, Ismael,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> For #3, typically, an admin won't configure
> > more
> > > > io
> > > > > > > > threads
> > > > > > > > > > than
> > > > > > > > > > >> > CPU
> > > > > > > > > > >> > >> > >> cores,
> > > > > > > > > > >> > >> > >> but it's possible for an admin to start with
> > > fewer
> > > > > io
> > > > > > > > > threads
> > > > > > > > > > >> than
> > > > > > > > > > >> > >> cores
> > > > > > > > > > >> > >> > >> and grow that later on.
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> Hi, Dong,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> I think the throttleTime sensor on the
> broker
> > > > tells
> > > > > > the
> > > > > > > > > admin
> > > > > > > > > > >> > >> whether a
> > > > > > > > > > >> > >> > >> user/clentId is throttled or not.
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> Hi, Radi,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> The reasoning for delaying the throttled
> > > requests
> > > > on
> > > > > > the
> > > > > > > > > > broker
> > > > > > > > > > >> > >> instead
> > > > > > > > > > >> > >> > of
> > > > > > > > > > >> > >> > >> returning an error immediately is that the
> > > latter
> > > > > has
> > > > > > no
> > > > > > > > way
> > > > > > > > > > to
> > > > > > > > > > >> > >> prevent
> > > > > > > > > > >> > >> > >> the
> > > > > > > > > > >> > >> > >> client from retrying immediately, which will
> > > make
> > > > > > things
> > > > > > > > > > worse.
> > > > > > > > > > >> The
> > > > > > > > > > >> > >> > >> delaying logic is based off a delay queue. A
> > > > > separate
> > > > > > > > > > expiration
> > > > > > > > > > >> > >> thread
> > > > > > > > > > >> > >> > >> just waits on the next to be expired
> request.
> > > So,
> > > > it
> > > > > > > > doesn't
> > > > > > > > > > tie
> > > > > > > > > > >> > up a
> > > > > > > > > > >> > >> > >> request handler thread.
> > > > > > > > > > >> > >> > >>
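> > > > > > > > > > >> > >> > >> A minimal sketch of that mechanism using java.util.concurrent
> > > > > > > > > > >> > >> > >> (illustrative only, not the actual broker code):
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >>   import java.util.concurrent.DelayQueue;
> > > > > > > > > > >> > >> > >>   import java.util.concurrent.Delayed;
> > > > > > > > > > >> > >> > >>   import java.util.concurrent.TimeUnit;
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >>   // A throttled response parked until its delay expires.
> > > > > > > > > > >> > >> > >>   class ThrottledResponse implements Delayed {
> > > > > > > > > > >> > >> > >>       final long sendAtNanos;
> > > > > > > > > > >> > >> > >>       final Runnable send;
> > > > > > > > > > >> > >> > >>       ThrottledResponse(long delayMs, Runnable send) {
> > > > > > > > > > >> > >> > >>           this.sendAtNanos = System.nanoTime()
> > > > > > > > > > >> > >> > >>               + TimeUnit.MILLISECONDS.toNanos(delayMs);
> > > > > > > > > > >> > >> > >>           this.send = send;
> > > > > > > > > > >> > >> > >>       }
> > > > > > > > > > >> > >> > >>       public long getDelay(TimeUnit unit) {
> > > > > > > > > > >> > >> > >>           return unit.convert(sendAtNanos - System.nanoTime(),
> > > > > > > > > > >> > >> > >>                               TimeUnit.NANOSECONDS);
> > > > > > > > > > >> > >> > >>       }
> > > > > > > > > > >> > >> > >>       public int compareTo(Delayed o) {
> > > > > > > > > > >> > >> > >>           return Long.compare(getDelay(TimeUnit.NANOSECONDS),
> > > > > > > > > > >> > >> > >>                               o.getDelay(TimeUnit.NANOSECONDS));
> > > > > > > > > > >> > >> > >>       }
> > > > > > > > > > >> > >> > >>   }
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >>   // One expiration thread; take() blocks until the earliest
> > > > > > > > > > >> > >> > >>   // delay expires, so no request handler thread is tied up.
> > > > > > > > > > >> > >> > >>   DelayQueue<ThrottledResponse> queue = new DelayQueue<>();
> > > > > > > > > > >> > >> > >>   new Thread(() -> {
> > > > > > > > > > >> > >> > >>       while (true) {
> > > > > > > > > > >> > >> > >>           try { queue.take().send.run(); }
> > > > > > > > > > >> > >> > >>           catch (InterruptedException e) { return; }
> > > > > > > > > > >> > >> > >>       }
> > > > > > > > > > >> > >> > >>   }).start();
> > > > > > > > > > >> > >> > >>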
> > > > > > > > > > >> > >> > >> Thanks,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> Jun
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael
> Juma <
> > > > > > > > > > ismael@juma.me.uk
> > > > > > > > > > >> >
> > > > > > > > > > >> > >> wrote:
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> > Hi Jay,
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > Regarding 1, I definitely like the
> > simplicity
> > > of
> > > > > > > > keeping a
> > > > > > > > > > >> single
> > > > > > > > > > >> > >> > >> throttle
> > > > > > > > > > >> > >> > >> > time field in the response. The downside
> is
> > > that
> > > > > the
> > > > > > > > > client
> > > > > > > > > > >> > metrics
> > > > > > > > > > >> > >> > >> will be
> > > > > > > > > > >> > >> > >> > more coarse grained.
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > Regarding 3, we have
> > > > `leader.imbalance.per.broker.percentage`
> > > > > > > > > > >> > and
> > > > > > > > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > Ismael
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay
> Kreps <
> > > > > > > > > > jay@confluent.io>
> > > > > > > > > > >> > >> wrote:
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > > A few minor comments:
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > >    1. Isn't it the case that the
> > throttling
> > > > time
> > > > > > > > > response
> > > > > > > > > > >> field
> > > > > > > > > > >> > >> > should
> > > > > > > > > > >> > >> > >> > have
> > > > > > > > > > >> > >> > >> > >    the total time your request was
> > throttled
> > > > > > > > > irrespective
> > > > > > > > > > of
> > > > > > > > > > >> > the
> > > > > > > > > > >> > >> > >> quotas
> > > > > > > > > > >> > >> > >> > > that
> > > > > > > > > > >> > >> > >> > >    caused that. Limiting it to byte rate
> > > quota
> > > > > > > doesn't
> > > > > > > > > > make
> > > > > > > > > > >> > >> sense,
> > > > > > > > > > >> > >> > >> but I
> > > > > > > > > > >> > >> > >> > > also
> > > > > > > > > > >> > >> > >> > >    I don't think we want to end up
> adding
> > > new
> > > > > > fields
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > >> > >> response
> > > > > > > > > > >> > >> > >> for
> > > > > > > > > > >> > >> > >> > > every
> > > > > > > > > > >> > >> > >> > >    single thing we quota, right?
> > > > > > > > > > >> > >> > >> > >    2. I don't think we should make this
> > > quota
> > > > > > > > > specifically
> > > > > > > > > > >> > about
> > > > > > > > > > >> > >> io
> > > > > > > > > > >> > >> > >> > >    threads. Once we introduce these
> quotas
> > > > > people
> > > > > > > set
> > > > > > > > > them
> > > > > > > > > > >> and
> > > > > > > > > > >> > >> > expect
> > > > > > > > > > >> > >> > >> > them
> > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > >> > >> > >> > >    be enforced (and if they aren't it
> may
> > > > cause
> > > > > an
> > > > > > > > > > outage).
> > > > > > > > > > >> As
> > > > > > > > > > >> > a
> > > > > > > > > > >> > >> > >> result
> > > > > > > > > > >> > >> > >> > > they
> > > > > > > > > > >> > >> > >> > >    are a bit more sensitive than normal
> > > > > configs, I
> > > > > > > > > think.
> > > > > > > > > > >> The
> > > > > > > > > > >> > >> > current
> > > > > > > > > > >> > >> > >> > > thread
> > > > > > > > > > >> > >> > >> > >    pools seem like something of an
> > > > > implementation
> > > > > > > > detail
> > > > > > > > > > and
> > > > > > > > > > >> > not
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > >> > level
> > > > > > > > > > >> > >> > >> > > the
> > > > > > > > > > >> > >> > >> > >    user-facing quotas should be involved
> > > > with. I
> > > > > > > think
> > > > > > > > > it
> > > > > > > > > > >> might
> > > > > > > > > > >> > >> be
> > > > > > > > > > >> > >> > >> better
> > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > >> > >> > >> > >    make this a general request-time
> > throttle
> > > > > with
> > > > > > no
> > > > > > > > > > >> mention in
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > >> > naming
> > > > > > > > > > >> > >> > >> > >    about I/O threads and simply
> > acknowledge
> > > > the
> > > > > > > > current
> > > > > > > > > > >> > >> limitation
> > > > > > > > > > >> > >> > >> (which
> > > > > > > > > > >> > >> > >> > > we
> > > > > > > > > > >> > >> > >> > >    may someday fix) in the docs that
> this
> > > > covers
> > > > > > > only
> > > > > > > > > the
> > > > > > > > > > >> time
> > > > > > > > > > >> > >> after
> > > > > > > > > > >> > >> > >> the
> > > > > > > > > > >> > >> > >> > >    thread is read off the network.
> > > > > > > > > > >> > >> > >> > >    3. As such I think the right
> interface
> > to
> > > > the
> > > > > > > user
> > > > > > > > > > would
> > > > > > > > > > >> be
> > > > > > > > > > >> > >> > >> something
> > > > > > > > > > >> > >> > >> > >    like percent_request_time and be in
> > > > > {0,...100}
> > > > > > or
> > > > > > > > > > >> > >> > >> request_time_ratio
> > > > > > > > > > >> > >> > >> > > and be
> > > > > > > > > > >> > >> > >> > >    in {0.0,...,1.0} (I think "ratio" is
> > the
> > > > > > > > terminology
> > > > > > > > > we
> > > > > > > > > > >> used
> > > > > > > > > > >> > >> if
> > > > > > > > > > >> > >> > the
> > > > > > > > > > >> > >> > >> > > scale
> > > > > > > > > > >> > >> > >> > >    is between 0 and 1 in the other
> > metrics,
> > > > > > right?)
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > -Jay
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini
> > > > Sivaram
> > > > > <
> > > > > > > > > > >> > >> > >> rajinisivaram@gmail.com
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > wrote:
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Thank you for the feedback.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Guozhang : I have updated the section
> on
> > > > > > > > co-existence
> > > > > > > > > of
> > > > > > > > > > >> byte
> > > > > > > > > > >> > >> rate
> > > > > > > > > > >> > >> > >> and
> > > > > > > > > > >> > >> > >> > > > request time quotas.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Dong: I hadn't added much detail to
> the
> > > > > metrics
> > > > > > > and
> > > > > > > > > > >> sensors
> > > > > > > > > > >> > >> since
> > > > > > > > > > >> > >> > >> they
> > > > > > > > > > >> > >> > >> > > are
> > > > > > > > > > >> > >> > >> > > > going to be very similar to the
> existing
> > > > > metrics
> > > > > > > and
> > > > > > > > > > >> sensors.
> > > > > > > > > > >> > >> To
> > > > > > > > > > >> > >> > >> avoid
> > > > > > > > > > >> > >> > >> > > > confusion, I have now added more
> detail.
> > > All
> > > > > > > metrics
> > > > > > > > > are
> > > > > > > > > > >> in
> > > > > > > > > > >> > the
> > > > > > > > > > >> > >> > >> group
> > > > > > > > > > >> > >> > >> > > > "quotaType" and all sensors have names
> > > > > starting
> > > > > > > with
> > > > > > > > > > >> > >> "quotaType"
> > > > > > > > > > >> > >> > >> (where
> > > > > > > > > > >> > >> > >> > > > quotaType is Produce/Fetch/
> > > > LeaderReplication/
> > > > > > > > > > >> > >> > >> > > > FollowerReplication/*IOThread*).
> > > > > > > > > > >> > >> > >> > > > So there will be no reuse of existing
> > > > > > > > metrics/sensors.
> > > > > > > > > > The
> > > > > > > > > > >> > new
> > > > > > > > > > >> > >> > ones
> > > > > > > > > > >> > >> > >> for
> > > > > > > > > > >> > >> > >> > > > request processing time based
> throttling
> > > > will
> > > > > be
> > > > > > > > > > >> completely
> > > > > > > > > > >> > >> > >> independent
> > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > >> > >> > >> > > > existing metrics/sensors, but will be
> > > > > consistent
> > > > > > > in
> > > > > > > > > > >> format.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > The existing throttle_time_ms field in
> > > > > > > produce/fetch
> > > > > > > > > > >> > responses
> > > > > > > > > > >> > >> > will
> > > > > > > > > > >> > >> > >> not
> > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > >> > >> > >> > > > impacted by this KIP. That will
> continue
> > > to
> > > > > > return
> > > > > > > > > > >> byte-rate
> > > > > > > > > > >> > >> based
> > > > > > > > > > >> > >> > >> > > > throttling times. In addition, a new
> > field
> > > > > > > > > > >> > >> > request_throttle_time_ms
> > > > > > > > > > >> > >> > >> > will
> > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > >> > >> > >> > > > added to return request quota based
> > > > throttling
> > > > > > > > times.
> > > > > > > > > > >> These
> > > > > > > > > > >> > >> will
> > > > > > > > > > >> > >> > be
> > > > > > > > > > >> > >> > >> > > exposed
> > > > > > > > > > >> > >> > >> > > > as new metrics on the client-side.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Since all metrics and sensors are
> > > different
> > > > > for
> > > > > > > each
> > > > > > > > > > type
> > > > > > > > > > >> of
> > > > > > > > > > >> > >> > quota,
> > > > > > > > > > >> > >> > >> I
> > > > > > > > > > >> > >> > >> > > > believe there is already sufficient
> > > metrics
> > > > to
> > > > > > > > monitor
> > > > > > > > > > >> > >> throttling
> > > > > > > > > > >> > >> > on
> > > > > > > > > > >> > >> > >> > both
> > > > > > > > > > >> > >> > >> > > > client and broker side for each type
> of
> > > > > > > throttling.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Regards,
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Rajini
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong
> > Lin
> > > <
> > > > > > > > > > >> > lindong28@gmail.com
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > > Hey Rajini,
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > I think it makes a lot of sense to
> use
> > > > > > > > > io_thread_units
> > > > > > > > > > >> as
> > > > > > > > > > >> > >> metric
> > > > > > > > > > >> > >> > >> to
> > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > >> > >> > >> > > > > user's traffic here. LGTM overall. I
> > > have
> > > > > some
> > > > > > > > > > questions
> > > > > > > > > > >> > >> > regarding
> > > > > > > > > > >> > >> > >> > > > sensors.
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - Can you be more specific in the
> KIP
> > > what
> > > > > > > sensors
> > > > > > > > > > will
> > > > > > > > > > >> be
> > > > > > > > > > >> > >> > added?
> > > > > > > > > > >> > >> > >> For
> > > > > > > > > > >> > >> > >> > > > > example, it will be useful to
> specify
> > > the
> > > > > name
> > > > > > > and
> > > > > > > > > > >> > >> attributes of
> > > > > > > > > > >> > >> > >> > these
> > > > > > > > > > >> > >> > >> > > > new
> > > > > > > > > > >> > >> > >> > > > > sensors.
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - We currently have throttle-time
> and
> > > > > > queue-size
> > > > > > > > for
> > > > > > > > > > >> > >> byte-rate
> > > > > > > > > > >> > >> > >> based
> > > > > > > > > > >> > >> > >> > > > quota.
> > > > > > > > > > >> > >> > >> > > > > Are you going to have separate
> > > > throttle-time
> > > > > > and
> > > > > > > > > > >> queue-size
> > > > > > > > > > >> > >> for
> > > > > > > > > > >> > >> > >> > > requests
> > > > > > > > > > >> > >> > >> > > > > throttled by io_thread_unit-based
> > quota,
> > > > or
> > > > > > will
> > > > > > > > > they
> > > > > > > > > > >> share
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > >> same
> > > > > > > > > > >> > >> > >> > > > > sensor?
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - Does the throttle-time in the
> > > > > > ProduceResponse
> > > > > > > > and
> > > > > > > > > > >> > >> > FetchResponse
> > > > > > > > > > >> > >> > >> > > > contains
> > > > > > > > > > >> > >> > >> > > > > time due to io_thread_unit-based
> > quota?
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - Currently the kafka server doesn't
> > > > provide
> > > > > > any
> > > > > > > > log
> > > > > > > > > > or
> > > > > > > > > > >> > >> metrics
> > > > > > > > > > >> > >> > >> that
> > > > > > > > > > >> > >> > >> > > > tells
> > > > > > > > > > >> > >> > >> > > > > whether any given clientId (or user)
> > is
> > > > > > > throttled.
> > > > > > > > > > This
> > > > > > > > > > >> is
> > > > > > > > > > >> > >> not
> > > > > > > > > > >> > >> > too
> > > > > > > > > > >> > >> > >> > bad
> > > > > > > > > > >> > >> > >> > > > > because we can still check the
> > > client-side
> > > > > > > > byte-rate
> > > > > > > > > > >> metric
> > > > > > > > > > >> > >> to
> > > > > > > > > > >> > >> > >> > validate
> > > > > > > > > > >> > >> > >> > > > > whether a given client is throttled.
> > But
> > > > > with
> > > > > > > this
> > > > > > > > > > >> > >> > io_thread_unit,
> > > > > > > > > > >> > >> > >> > > there
> > > > > > > > > > >> > >> > >> > > > > will be no way to validate whether a
> > > given
> > > > > > > client
> > > > > > > > is
> > > > > > > > > > >> slow
> > > > > > > > > > >> > >> > because
> > > > > > > > > > >> > >> > >> it
> > > > > > > > > > >> > >> > >> > > has
> > > > > > > > > > >> > >> > >> > > > > exceeded its io_thread_unit limit.
> It
> > is
> > > > > > > necessary
> > > > > > > > > for
> > > > > > > > > > >> user
> > > > > > > > > > >> > >> to
> > > > > > > > > > >> > >> > be
> > > > > > > > > > >> > >> > >> > able
> > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > >> > >> > >> > > > > know this information to figure out
> > > > whether
> > > > > > they
> > > > > > > > > have
> > > > > > > > > > >> > reached
> > > > > > > > > > >> > >> > >> their
> > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > >> > >> > >> > > > > limit. How about we add log4j log on
> > the
> > > > > > server
> > > > > > > > side
> > > > > > > > > > to
> > > > > > > > > > >> > >> > >> periodically
> > > > > > > > > > >> > >> > >> > > > print
> > > > > > > > > > >> > >> > >> > > > > the (client_id,
> > byte-rate-throttle-time,
> > > > > > > > > > >> > >> > >> > io-thread-unit-throttle-time)
> > > > > > > > > > >> > >> > >> > > so
> > > > > > > > > > >> > >> > >> > > > > that the kafka administrator can identify
> > > those
> > > > > > users
> > > > > > > > that
> > > > > > > > > > >> have
> > > > > > > > > > >> > >> > reached
> > > > > > > > > > >> > >> > >> > their
> > > > > > > > > > >> > >> > >> > > > > limit and act accordingly?
> > > > > > > > > > >> > >> > >> > > > >
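> > > > > > > > > > >> > >> > >> > > > > Something along these lines (a sketch only: the scheduler is a
> > > > > > > > > > >> > >> > >> > > > > ScheduledExecutorService, log an slf4j logger, and quotaStats()
> > > > > > > > > > >> > >> > >> > > > > and the stats class are hypothetical, not existing Kafka code):
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >   // Periodically log clients throttled by either quota type.
> > > > > > > > > > >> > >> > >> > > > >   scheduler.scheduleAtFixedRate(() -> {
> > > > > > > > > > >> > >> > >> > > > >       for (ClientThrottleStats s : quotaStats()) {
> > > > > > > > > > >> > >> > >> > > > >           if (s.byteRateThrottleMs() > 0 || s.ioThreadThrottleMs() > 0)
> > > > > > > > > > >> > >> > >> > > > >               log.info("throttled: clientId={} byteRateMs={} ioThreadUnitMs={}",
> > > > > > > > > > >> > >> > >> > > > >                        s.clientId(), s.byteRateThrottleMs(),
> > > > > > > > > > >> > >> > >> > > > >                        s.ioThreadThrottleMs());
> > > > > > > > > > >> > >> > >> > > > >       }
> > > > > > > > > > >> > >> > >> > > > >   }, 0, 60, TimeUnit.SECONDS);
> > > > > > > > > > >> > >> > >> > > > >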
> > > > > > > > > > >> > >> > >> > > > > Thanks,
> > > > > > > > > > >> > >> > >> > > > > Dong
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM,
> > > Guozhang
> > > > > > Wang <
> > > > > > > > > > >> > >> > >> wangguoz@gmail.com>
> > > > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > > Made a pass over the doc, overall
> > LGTM
> > > > > > except
> > > > > > > a
> > > > > > > > > > minor
> > > > > > > > > > >> > >> comment
> > > > > > > > > > >> > >> > on
> > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > >> > >> > >> > > > > > throttling implementation:
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > Stated as "Request processing time
> > > > > > throttling
> > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > >> > >> applied
> > > > > > > > > > >> > >> > on
> > > > > > > > > > >> > >> > >> > top
> > > > > > > > > > >> > >> > >> > > if
> > > > > > > > > > >> > >> > >> > > > > > necessary." I thought that it
> meant
> > > the
> > > > > > > request
> > > > > > > > > > >> > processing
> > > > > > > > > > >> > >> > time
> > > > > > > > > > >> > >> > >> > > > > throttling
> > > > > > > > > > >> > >> > >> > > > > > is applied first, but continuing to
> > > read, I
> > > > > > found
> > > > > > > > it
> > > > > > > > > > >> > actually
> > > > > > > > > > >> > >> > >> meant to
> > > > > > > > > > >> > >> > >> > > > apply
> > > > > > > > > > >> > >> > >> > > > > > produce / fetch byte rate
> throttling
> > > > > first.
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > Also the last sentence "The
> > remaining
> > > > > delay
> > > > > > if
> > > > > > > > any
> > > > > > > > > > is
> > > > > > > > > > >> > >> applied
> > > > > > > > > > >> > >> > to
> > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > >> > >> > >> > > > > > response." is a bit confusing to
> me.
> > > > Maybe
> > > > > > > > > rewording
> > > > > > > > > > >> it a
> > > > > > > > > > >> > >> bit?
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > Guozhang
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM,
> Jun
> > > > Rao <
> > > > > > > > > > >> > jun@confluent.io
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > Thanks for the updated KIP. The
> > > latest
> > > > > > > > proposal
> > > > > > > > > > >> looks
> > > > > > > > > > >> > >> good
> > > > > > > > > > >> > >> > to
> > > > > > > > > > >> > >> > >> me.
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > Jun
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM,
> > > > Rajini
> > > > > > > > Sivaram
> > > > > > > > > <
> > > > > > > > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > Thank you for the feedback.
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > 1. I have updated the KIP to
> use
> > > > > > absolute
> > > > > > > > > units
> > > > > > > > > > >> > >> instead of
> > > > > > > > > > >> > >> > >> > > > > percentage.
> > > > > > > > > > >> > >> > >> > > > > > > The
> > > > > > > > > > >> > >> > >> > > > > > > > property is called *io_thread_units*
> > > > > to
> > > > > > > > align
> > > > > > > > > > with
> > > > > > > > > > >> > the
> > > > > > > > > > >> > >> > >> thread
> > > > > > > > > > >> > >> > >> > > count
> > > > > > > > > > >> > >> > >> > > > > > > > property *num.io.threads*.
> When
> > we
> > > > > > > implement
> > > > > > > > > > >> network
> > > > > > > > > > >> > >> > thread
> > > > > > > > > > >> > >> > >> > > > > utilization
> > > > > > > > > > >> > >> > >> > > > > > > > quotas, we can add another
> > > property
> > > > > > > > > > >> > >> > *network_thread_units.*
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > 2. ControlledShutdown is
> already
> > > > > listed
> > > > > > > > under
> > > > > > > > > > the
> > > > > > > > > > >> > >> exempt
> > > > > > > > > > >> > >> > >> > > requests.
> > > > > > > > > > >> > >> > >> > > > > Jun,
> > > > > > > > > > >> > >> > >> > > > > > > did
> > > > > > > > > > >> > >> > >> > > > > > > > you mean a different request
> > that
> > > > > needs
> > > > > > to
> > > > > > > > be
> > > > > > > > > > >> added?
> > > > > > > > > > >> > >> The
> > > > > > > > > > >> > >> > >> four
> > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > >> > >> > >> > > > > > > > currently exempt in the KIP
> are
> > > > > > > StopReplica,
> > > > > > > > > > >> > >> > >> > ControlledShutdown,
> > > > > > > > > > >> > >> > >> > > > > > > > LeaderAndIsr and
> UpdateMetadata.
> > > > These
> > > > > > are
> > > > > > > > > > >> controlled
> > > > > > > > > > >> > >> > using
> > > > > > > > > > >> > >> > >> > > > > > ClusterAction
> > > > > > > > > > >> > >> > >> > > > > > > > ACL, so it is easy to exclude
> > and
> > > > only
> > > > > > > > > throttle
> > > > > > > > > > if
> > > > > > > > > > >> > >> > >> > unauthorized.
> > > > > > > > > > >> > >> > >> > > I
> > > > > > > > > > >> > >> > >> > > > > > wasn't
> > > > > > > > > > >> > >> > >> > > > > > > > sure if there are other
> requests
> > > > used
> > > > > > only
> > > > > > > > for
> > > > > > > > > > >> > >> > inter-broker
> > > > > > > > > > >> > >> > >> > that
> > > > > > > > > > >> > >> > >> > > > > needed
> > > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > > >> > >> > >> > > > > > > > be excluded.
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > 3. I was thinking the smallest
> > > > change
> > > > > > > would
> > > > > > > > be
> > > > > > > > > > to
> > > > > > > > > > >> > >> replace
> > > > > > > > > > >> > >> > >> all
> > > > > > > > > > >> > >> > >> > > > > > references
> > > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > > >> > >> > >> > > > > > > > *requestChannel.sendResponse()
> *
> > > > with
> > > > > a
> > > > > > > > local
> > > > > > > > > > >> method
> > > > > > > > > > >> > >> > >> > > > > > > > *sendResponseMaybeThrottle()*
> > that
> > > > > does
> > > > > > > the
> > > > > > > > > > >> > throttling
> > > > > > > > > > >> > >> if
> > > > > > > > > > >> > >> > >> any
> > > > > > > > > > >> > >> > >> > > plus
> > > > > > > > > > >> > >> > >> > > > > send
> > > > > > > > > > >> > >> > >> > > > > > > > response. If we throttle first
> > in
> > > > > > > > > > >> > *KafkaApis.handle()*,
> > > > > > > > > > >> > >> > the
> > > > > > > > > > >> > >> > >> > time
> > > > > > > > > > >> > >> > >> > > > > spent
> > > > > > > > > > >> > >> > >> > > > > > > > within the method handling the
> > > > request
> > > > > > > will
> > > > > > > > > not
> > > > > > > > > > be
> > > > > > > > > > >> > >> > recorded
> > > > > > > > > > >> > >> > >> or
> > > > > > > > > > >> > >> > >> > > used
> > > > > > > > > > >> > >> > >> > > > > in
> > > > > > > > > > >> > >> > >> > > > > > > > throttling. We can look into
> > this
> > > > > again
> > > > > > > when
> > > > > > > > > the
> > > > > > > > > > >> PR
> > > > > > > > > > >> > is
> > > > > > > > > > >> > >> > ready
> > > > > > > > > > >> > >> > >> > for
> > > > > > > > > > >> > >> > >> > > > > > review.
> > > > > > > > > > >> > >> > >> > > > > > > >
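> > > > > > > > > > >> > >> > >> > > > > > > > Roughly this shape (a sketch only; the broker code is Scala
> > > > > > > > > > >> > >> > >> > > > > > > > and the names below are illustrative):
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > >   // Record request time, then either delay or send now.
> > > > > > > > > > >> > >> > >> > > > > > > >   void sendResponseMaybeThrottle(Request req, Response resp) {
> > > > > > > > > > >> > >> > >> > > > > > > >       long throttleMs = quotaManager.recordAndGetThrottleTimeMs(req);
> > > > > > > > > > >> > >> > >> > > > > > > >       if (throttleMs > 0)
> > > > > > > > > > >> > >> > >> > > > > > > >           delayQueue.put(new ThrottledResponse(throttleMs,
> > > > > > > > > > >> > >> > >> > > > > > > >               () -> requestChannel.sendResponse(resp)));
> > > > > > > > > > >> > >> > >> > > > > > > >       else
> > > > > > > > > > >> > >> > >> > > > > > > >           requestChannel.sendResponse(resp);
> > > > > > > > > > >> > >> > >> > > > > > > >   }
> > > > > > > > > > >> > >> > >> > > > > > > >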
> > > > > > > > > > >> > >> > >> > > > > > > > Regards,
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > Rajini
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55
> PM,
> > > > Roger
> > > > > > > > Hoover
> > > > > > > > > <
> > > > > > > > > > >> > >> > >> > > > > roger.hoover@gmail.com>
> > > > > > > > > > >> > >> > >> > > > > > > > wrote:
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > Great to see this KIP and
> the
> > > > > > excellent
> > > > > > > > > > >> discussion.
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > To me, Jun's suggestion
> makes
> > > > sense.
> > > > > > If
> > > > > > > > my
> > > > > > > > > > >> > >> application
> > > > > > > > > > >> > >> > is
> > > > > > > > > > >> > >> > >> > > > > allocated
> > > > > > > > > > >> > >> > >> > > > > > 1
> > > > > > > > > > >> > >> > >> > > > > > > > > request handler unit, then
> > it's
> > > as
> > > > > if
> > > > > > I
> > > > > > > > > have a
> > > > > > > > > > >> > Kafka
> > > > > > > > > > >> > >> > >> broker
> > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > >> > >> > >> > > > a
> > > > > > > > > > >> > >> > >> > > > > > > single
> > > > > > > > > > >> > >> > >> > > > > > > > > request handler thread
> > dedicated
> > > > to
> > > > > > me.
> > > > > > > > > > That's
> > > > > > > > > > >> the
> > > > > > > > > > >> > >> > most I
> > > > > > > > > > >> > >> > >> > can
> > > > > > > > > > >> > >> > >> > > > use,
> > > > > > > > > > >> > >> > >> > > > > > at
> > > > > > > > > > >> > >> > >> > > > > > > > > least.  That allocation
> > doesn't
> > > > > change
> > > > > > > > even
> > > > > > > > > if
> > > > > > > > > > >> an
> > > > > > > > > > >> > >> admin
> > > > > > > > > > >> > >> > >> later
> > > > > > > > > > >> > >> > >> > > > > > increases
> > > > > > > > > > >> > >> > >> > > > > > > > the
> > > > > > > > > > >> > >> > >> > > > > > > > > size of the request thread
> > pool
> > > on
> > > > > the
> > > > > > > > > broker.
> > > > > > > > > > >> > It's
> > > > > > > > > > >> > >> > >> similar
> > > > > > > > > > >> > >> > >> > to
> > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > >> > >> > >> > > > > > CPU
> > > > > > > > > > >> > >> > >> > > > > > > > > abstraction that VMs and
> > > > containers
> > > > > > get
> > > > > > > > from
> > > > > > > > > > >> > >> hypervisors
> > > > > > > > > > >> > >> > >> or
> > > > > > > > > > >> > >> > >> > OS
> > > > > > > > > > >> > >> > >> > > > > > > > schedulers.
> > > > > > > > > > >> > >> > >> > > > > > > > > While different client
> access
> > > > > patterns
> > > > > > > can
> > > > > > > > > use
> > > > > > > > > > >> > wildly
> > > > > > > > > > >> > >> > >> > different
> > > > > > > > > > >> > >> > >> > > > > > amounts
> > > > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > > > >> > >> > >> > > > > > > > > request thread resources per
> > > > > request,
> > > > > > a
> > > > > > > > > given
> > > > > > > > > > >> > >> > application
> > > > > > > > > > >> > >> > >> > will
> > > > > > > > > > >> > >> > >> > > > > > > generally
> > > > > > > > > > >> > >> > >> > > > > > > > > have a stable access pattern
> > and
> > > > can
> > > > > > > > figure
> > > > > > > > > > out
> > > > > > > > > > >> > >> > >> empirically
> > > > > > > > > > >> > >> > >> > how
> > > > > > > > > > >> > >> > >> > > > > many
> > > > > > > > > > >> > >> > >> > > > > > > > > "request thread units" it
> > needs
> > > to
> > > > > > meet
> > > > > > > > its
> > > > > > > > > > >> > >> > >> > throughput/latency
> > > > > > > > > > >> > >> > >> > > > > > goals.
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > Cheers,
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > Roger
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53
> > AM,
> > > > Jun
> > > > > > > Rao <
> > > > > > > > > > >> > >> > >> jun@confluent.io>
> > > > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > > Hi, Rajini,
> > > > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > > Thanks for the updated
> KIP.
> > A
> > > > few
> > > > > > more
> > > > > > > > > > >> comments.
> > > > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > > 1. A concern of
> > > > > request_time_percent
> > > > > > > is
> > > > > > > > > that
> > > > > > > > > > >> it's
> > > > > > > > > > >> > >> not
> > > > > > > > > > >> > >> > an
> > > > > > > > > > >> > >> > >> > > > absolute
> > > > > > > > > > >> > >> > >> > > > > > > > value.
> > > > > > > > > > >> > >> > >> > > > > > > > > > Let's say you give a user
> a
> > > 10%
> > > > > > limit.
> > > > > > > > If
> > > > > > > > > > the
> > > > > > > > > > >> > admin
> > > > > > > > > > >> > >> > >> doubles
> > > > > > > > > > >> > >> > >> > > the
> > > > > > > > > > >> > >> > >> > > > > > > number
> > > > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > > > >> > >> > >> > > > > > > > > > request handler threads,
> > that
> > > > user
> > > > > > now
> > > > > > > > > > >> actually
> > > > > > > > > > >> > has
> > > > > > > > > > >> > >> > >> twice
> > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > >> > >> > >> > > > > > > absolute
> > > > > > > > > > >> > >> > >> > > > > > > > > > capacity. This may confuse
> > > > people
> > > > > a
> > > > > > > bit.
> > > > > > > > > So,
> > > > > > > > > > >> > >> perhaps
> > > > > > > > > > >> > >> > >> > setting
> > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > >> > >> > >> > > > > > > quota
> > > > > > > > > > >> > >> > >> > > > > > > > > > based on an absolute
> request
> > > > > thread
> > > > > > > unit
> > > > > > > > > is
> > > > > > > > > > >> > better.
> > > > > > > > > > >> > >> > >> > > > > > > > > >
> > > 2. ControlledShutdownRequest is also an inter-broker request and
> > > needs to be excluded from throttling.
> > >
> > > 3. Implementation wise, I am wondering if it's simpler to apply the
> > > request time throttling first in KafkaApis.handle(). Otherwise, we
> > > will need to add the throttling logic in each type of request.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > wrote:
> > > > Jun,
> > > >
> > > > Thank you for the review.
> > > >
> > > > I have reverted to the original KIP that throttles based on request
> > > > handler utilization. At the moment, it uses percentage, but I am
> > > > happy to change to a fraction (out of 1 instead of 100) if required.
> > > > I have added the examples from this discussion to the KIP. Also
> > > > added a "Future Work" section to address network thread utilization.
> > > > The configuration is named "request_time_percent" with the
> > > > expectation that it can also be used as the limit for network thread
> > > > utilization when that is implemented, so that users have to set only
> > > > one config for the two and not have to worry about the internal
> > > > distribution of the work between the two thread pools in Kafka.
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
> > > > > Hi, Rajini,
> > > > >
> > > > > Thanks for the proposal.
> > > > >
> > > > > The benefit of using the request processing time over the request
> > > > > rate is exactly what people have said. I will just expand that a
> > > > > bit. Consider the following case. The producer sends a produce
> > > > > request with a 10MB message but compressed to 100KB with gzip. The
> > > > > decompression of the message on the broker could take 10-15
> > > > > seconds, during which time, a request handler thread is completely
> > > > > blocked. In this case, neither the byte-in quota nor the request
> > > > > rate quota may be effective in protecting the broker. Consider
> > > > > another case. A consumer group starts with 10 instances and later
> > > > > on switches to 20 instances. The request rate will likely double,
> > > > > but the actual load on the broker may not double since each fetch
> > > > > request only contains half of the partitions. Request rate quota
> > > > > may not be easy to configure in this case.
> > > > >
> > > > > What we really want is to be able to prevent a client from using
> > > > > too much of the server side resources. In this particular KIP, this
> > > > > resource is the capacity of the request handler threads. I agree
> > > > > that it may not be intuitive for the users to determine how to set
> > > > > the right limit. However, this is not completely new and has been
> > > > > done in the container world already. For example, Linux cgroup (
> > > > > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html
> > > > > ) has the concept of cpu.cfs_quota_us, which specifies the total
> > > > > amount of time in microseconds for which all tasks in a cgroup can
> > > > > run during a one second period. We can potentially model the
> > > > > request handler threads in a similar way. For example, each request
> > > > > handler thread can be 1 request handler unit and the admin can
> > > > > configure a limit on how many units (say 0.01) a client can have.
> > > > >
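A minimal sketch of this cgroup-style model, assuming a hypothetical per-client budget of request-handler time per one-second window (illustrative only, not the KIP's actual implementation; compare cpu.cfs_quota_us / cfs_period_us):

    import java.util.concurrent.TimeUnit;

    // Each client gets a fixed budget of request-handler time per window,
    // e.g. 0.01 handler units = 10ms of handler time per 1s window.
    class RequestTimeBudget {
        private final long windowNanos = TimeUnit.SECONDS.toNanos(1);
        private final long quotaNanosPerWindow;
        private long windowStart = System.nanoTime();
        private long usedNanos = 0;

        RequestTimeBudget(double handlerUnits) {
            this.quotaNanosPerWindow = (long) (handlerUnits * windowNanos);
        }

        // Record time a request spent on a handler thread; returns false
        // when the budget for the current window is exhausted (throttle).
        synchronized boolean recordAndCheck(long requestNanos) {
            long now = System.nanoTime();
            if (now - windowStart >= windowNanos) { // roll over to a new window
                windowStart = now;
                usedNanos = 0;
            }
            usedNanos += requestNanos;
            return usedNanos <= quotaNanosPerWindow;
        }
    }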
> > > > > Regarding not throttling the internal broker to broker requests:
> > > > > we could do that. Alternatively, we could just let the admin
> > > > > configure a high limit for the kafka user (it may not be able to do
> > > > > that easily based on clientId though).
> > > > >
> > > > > Ideally we want to be able to protect the utilization of the
> > > > > network thread pool too. The difficulty is mostly what Rajini said:
> > > > > (1) The mechanism for throttling the requests is through Purgatory
> > > > > and we will have to think through how to integrate that into the
> > > > > network layer. (2) In the network layer, currently we know the
> > > > > user, but not the clientId of the request. So, it's a bit tricky to
> > > > > throttle based on clientId there. Plus, the byteOut quota can
> > > > > already protect the network thread utilization for fetch requests.
> > > > > So, if we can't figure out this part right now, just focusing on
> > > > > the request handling threads for this KIP is still a useful
> > > > > feature.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > wrote:
> > > > > > Thank you all for the feedback.
> > > > > >
> > > > > > Jay: I have removed exemption for consumer heartbeat etc. Agree
> > > > > > that protecting the cluster is more important than protecting
> > > > > > individual apps. Have retained the exemption for
> > > > > > StopReplica/LeaderAndIsr etc; these are throttled only if
> > > > > > authorization fails (so they can't be used for DoS attacks in a
> > > > > > secure cluster, but inter-broker requests are allowed to complete
> > > > > > without delays).
> > > > > >
> > > > > > I will wait another day to see if there is any objection to
> > > > > > quotas based on request processing time (as opposed to request
> > > > > > rate) and if there are no objections, I will revert to the
> > > > > > original proposal with some changes.
> > > > > >
> > > > > > The original proposal was only including the time used by the
> > > > > > request handler threads (that made calculation easy). I think the
> > > > > > suggestion is to include the time spent in the network threads as
> > > > > > well since that may be significant. As Jay pointed out, it is
> > > > > > more complicated to calculate the total available CPU time and
> > > > > > convert to a ratio when there are *m* I/O threads and *n* network
> > > > > > threads. ThreadMXBean#getThreadCPUTime() may give us what we
> > > > > > want, but it can be very expensive on some platforms. As Becket
> > > > > > and Guozhang have pointed out, we do have several time
> > > > > > measurements already for generating metrics that we could use,
> > > > > > though we might want to switch to nanoTime() instead of
> > > > > > currentTimeMillis() since some of the values for small requests
> > > > > > may be < 1ms. But rather than add up the time spent in the I/O
> > > > > > thread and the network thread, wouldn't it be better to convert
> > > > > > the time spent on each thread into a separate ratio? UserA has a
> > > > > > request quota of 5%. Can we take that to mean that UserA can use
> > > > > > 5% of the time on network threads and 5% of the time on I/O
> > > > > > threads? If either is exceeded, the response is throttled - it
> > > > > > would mean maintaining two sets of metrics for the two durations,
> > > > > > but would result in more meaningful ratios. We could define two
> > > > > > quota limits (UserA has 5% of request threads and 10% of network
> > > > > > threads), but that seems unnecessary and harder to explain to
> > > > > > users.
> > > > > >
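A small sketch of this "one percentage, two ratios" idea, with hypothetical names: the single configured fraction is checked independently against each thread pool's measured utilization.

    // Sketch only: one quota value, enforced separately per thread pool.
    class DualPoolQuota {
        private final double quotaFraction;  // e.g. 0.05 for a 5% quota
        private double ioTimeRatio;          // measured share of I/O thread time
        private double networkTimeRatio;     // measured share of network thread time

        DualPoolQuota(double quotaFraction) {
            this.quotaFraction = quotaFraction;
        }

        void update(double ioRatio, double networkRatio) {
            this.ioTimeRatio = ioRatio;
            this.networkTimeRatio = networkRatio;
        }

        boolean shouldThrottle() {
            // One config value, checked against each pool independently.
            return ioTimeRatio > quotaFraction || networkTimeRatio > quotaFraction;
        }
    }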
> > > > > > Back to why and how quotas are applied to network thread
> > > > > > utilization:
> > > > > >
> > > > > > a) In the case of fetch, the time spent in the network thread may
> > > > > > be significant and I can see the need to include this. Are there
> > > > > > other requests where the network thread utilization is
> > > > > > significant? In the case of fetch, request handler thread
> > > > > > utilization would throttle clients with high request rate and low
> > > > > > data volume, and the fetch byte rate quota will throttle clients
> > > > > > with high data volume. Network thread utilization is perhaps
> > > > > > proportional to the data volume. I am wondering if we even need
> > > > > > to throttle based on network thread utilization or whether the
> > > > > > data volume quota covers this case.
> > > > > >
> > > > > > b) At the moment, we record and check for quota violation at the
> > > > > > same time. If a quota is violated, the response is delayed. Using
> > > > > > Jay's example of disk reads for fetches happening in the network
> > > > > > thread, we can't record and delay a response after the disk
> > > > > > reads. We could record the time spent on the network thread when
> > > > > > the response is complete and introduce a delay for handling a
> > > > > > subsequent request (separating out recording and quota violation
> > > > > > handling in the case of network thread overload). Does that make
> > > > > > sense?
> > > > > >
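A rough sketch, with hypothetical names, of the decoupling described in (b): record the network-thread time once the response completes, and bank any resulting delay against the client's next request (window rollover omitted for brevity).

    // Sketch only: recording happens after the response; enforcement is
    // deferred to the client's next request.
    class DeferredThrottle {
        private final long quotaMsPerWindow;
        private long usedMs = 0;
        private long pendingDelayMs = 0;

        DeferredThrottle(long quotaMsPerWindow) {
            this.quotaMsPerWindow = quotaMsPerWindow;
        }

        // Network thread, after the response completes: record time and,
        // if the quota is now violated, bank the new overage as a delay.
        synchronized void recordResponseTime(long networkThreadMs) {
            long before = Math.max(usedMs - quotaMsPerWindow, 0);
            usedMs += networkThreadMs;
            long after = Math.max(usedMs - quotaMsPerWindow, 0);
            pendingDelayMs += after - before; // only the new overage counts
        }

        // Before handling the client's next request: consume the banked delay.
        synchronized long takeDelayMs() {
            long delay = pendingDelayMs;
            pendingDelayMs = 0;
            return delay;
        }
    }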
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com>
> > > > > > wrote:
> > > > > > > Hey Jay,
> > > > > > >
> > > > > > > Yeah, I agree that enforcing the CPU time is a little tricky. I
> > > > > > > am thinking that maybe we can use the existing request
> > > > > > > statistics. They are already very detailed, so we can probably
> > > > > > > derive the approximate CPU time from them, e.g. something like
> > > > > > > (total_time - request/response_queue_time - remote_time).
> > > > > > >
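A sketch of deriving that approximation from per-request time components; the field names are illustrative stand-ins for the broker's existing request metrics, not their actual identifiers.

    // Sketch only: approximate on-CPU time from existing time breakdowns
    // instead of sampling thread CPU time directly.
    class RequestTimeBreakdown {
        long totalTimeMs;          // end-to-end time for the request
        long requestQueueTimeMs;   // waiting in the request queue
        long responseQueueTimeMs;  // waiting in the response queue
        long remoteTimeMs;         // waiting on other brokers (e.g. acks=all)

        // total_time - request/response_queue_time - remote_time
        long approximateLocalTimeMs() {
            return totalTimeMs - requestQueueTimeMs
                    - responseQueueTimeMs - remoteTimeMs;
        }
    }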
> > > > > > > I agree with Guozhang that when a user is throttled it is
> > > > > > > likely that we need to see if anything has gone wrong first,
> > > > > > > and if the users are well behaved and just need more resources,
> > > > > > > we will have to bump up the quota for them. It is true that
> > > > > > > pre-allocating CPU time quota precisely for the users is
> > > > > > > difficult. So in practice it would probably be more like first
> > > > > > > setting a relatively high protective CPU time quota for
> > > > > > > everyone and increasing it for some individual clients on
> > > > > > > demand.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jiangjie (Becket) Qin
> > > > > > >
> > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com>
> > > > > > > wrote:
> > > > > > > > This is a great proposal, glad to see it happening.
> > > > > > > >
> > > > > > > > I am inclined to the CPU throttling, or more specifically the
> > > > > > > > processing time ratio, instead of the request rate throttling
> > > > > > > > as well. Becket has summed up my rationales very well above,
> > > > > > > > and one thing to add here is that the former has good support
> > > > > > > > both for "protecting against rogue clients" and for
> > > > > > > > "utilizing a cluster for multi-tenancy usage": when thinking
> > > > > > > > about how to explain this to the end users, I find it
> > > > > > > > actually more natural than the request rate since, as
> > > > > > > > mentioned above, different requests will have quite different
> > > > > > > > "cost", and Kafka today already has various request types
> > > > > > > > (produce, fetch, admin, metadata, etc); because of that, the
> > > > > > > > request rate throttling may not be as effective unless it is
> > > > > > > > set very conservatively.
> > > > > > > >
> > > > > > > > Regarding user reactions when they are throttled, I think it
> > > > > > > > may differ case-by-case, and needs to be discovered / guided
> > > > > > > > by looking at relative metrics. So in other words users would
> > > > > > > > not expect to get additional information by simply being told
> > > > > > > > "hey, you are throttled", which is all that throttling does;
> > > > > > > > they need to take a follow-up step and see "hmm, I'm
> > > > > > > > throttled probably because of ..", which is done by looking
> > > > > > > > at other metric values: e.g. whether I'm bombarding the
> > > > > > > > brokers with ...
> > > > > > > >
> > > > > > > > [Message clipped]
>
> --
> *Todd Palino*
> Staff Site Reliability Engineer
> Data Infrastructure Streaming
>
> linkedin.com/in/toddpalino
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Hi Todd,

Thank you for the review.

For SSL, the case that is not covered is Scenario 6 in the KIP that Ismael
pointed out. For clusters with only SSL or PLAINTEXT, byte rate quotas work
well, but for clusters with both SSL and PLAINTEXT, network thread
utilization also needs to be taken into account.

For the percentage used in quota configuration, it looks like opinion is
still split between an overall percentage and a per-thread percentage. I
will wait for Jun to respond before updating the KIP either way.

Regards,

Rajini

On Wed, Mar 8, 2017 at 12:45 AM, Todd Palino <tp...@gmail.com> wrote:

> I’ve been following this one on and off, and overall it sounds good to me.
>
> - The SSL question is a good one. However, that type of overhead should be
> proportional to the bytes rate, so I think that a bytes rate quota would
> still be a suitable way to address it.
>
> - I think it’s better to make the quota percentage of total thread pool
> capacity, and not percentage of an individual thread. That way you don’t
> have to adjust it when you adjust thread counts (tuning, hardware changes,
> etc.)
>
>
> -Todd
>
>
>
> On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <be...@gmail.com> wrote:
>
> > I see. Good point about SSL.
> >
> > I just asked Todd to take a look.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Jiangjie,
> > >
> > > Yes, I agree that byte rate already protects the network threads
> > > indirectly. I am not sure if byte rate fully captures the CPU overhead
> > > in network due to SSL. So, at the high level, we can use request time
> > > limit to protect CPU and use byte rate to protect storage and network.
> > >
> > > Also, do you think you can get Todd to comment on this KIP?
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <be...@gmail.com> wrote:
> > >
> > > > Hi Rajini/Jun,
> > > >
> > > > The percentage based reasoning sounds good.
> > > > One thing I am wondering: if we assume the network threads are just
> > > > doing network IO, can we say the bytes rate quota is already a sort
> > > > of network threads quota?
> > > > If we take network threads into consideration here, would that be
> > > > somewhat overlapping with the bytes rate quota?
> > > >
> > > > Thanks,
> > > >
> > > > Jiangjie (Becket) Qin
> > > >
> > > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > Thank you for the explanation, I hadn't realized you meant
> > > > > percentage of the total thread pool. If everyone is OK with Jun's
> > > > > suggestion, I will update the KIP.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Rajini
> > > > >
> > > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Let's take your example. Let's say a user sets the limit to 50%.
> > > > > > I am not sure if it's better to apply the same percentage
> > > > > > separately to the network and io thread pools. For example, for
> > > > > > produce requests, most of the time will be spent in the io
> > > > > > threads whereas for fetch requests, most of the time will be in
> > > > > > the network threads. So, using the same percentage in both thread
> > > > > > pools means one of the pools' resources will be over-allocated.
> > > > > >
> > > > > > An alternative way is to simply model the network and io thread
> > > > > > pools together. If you get 10 io threads and 5 network threads,
> > > > > > you get 1500% request processing power. A 50% limit means a total
> > > > > > of 750% processing power. We just add up the time a user request
> > > > > > spent in either the network or io thread. If that total exceeds
> > > > > > 750% (doesn't matter whether it's spent more in the network or io
> > > > > > thread), the request will be throttled. This seems more general
> > > > > > and is not sensitive to the current implementation detail of
> > > > > > having separate network and io thread pools. In the future, if
> > > > > > the threading model changes, the same concept of quota can still
> > > > > > be applied. For now, since it's a bit tricky to add the delay
> > > > > > logic in the network thread pool, we could probably just do the
> > > > > > delaying only in the io threads as you suggested earlier.
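A worked sketch of this combined model with the numbers above (illustrative only; the utilization figures are made up):

    // 10 I/O + 5 network threads => 15 * 100% = 1500% total capacity;
    // a 50% quota is then a 750% share, spent in either pool.
    public class CombinedPoolQuotaExample {
        public static void main(String[] args) {
            int ioThreads = 10, networkThreads = 5;
            double totalCapacityPercent = (ioThreads + networkThreads) * 100.0; // 1500%
            double quotaFraction = 0.50;
            double budgetPercent = quotaFraction * totalCapacityPercent;        // 750%

            // Measured utilization for one user over the quota window:
            double ioPercent = 600.0;      // 6 I/O threads' worth of time
            double networkPercent = 200.0; // 2 network threads' worth of time

            boolean throttle = (ioPercent + networkPercent) > budgetPercent;
            System.out.printf("budget=%.0f%%, used=%.0f%%, throttle=%b%n",
                    budgetPercent, ioPercent + networkPercent, throttle);
            // budget=750%, used=800%, throttle=true
        }
    }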
> > > > > >
> > > > > > There is still the orthogonal question of whether a quota of 50%
> > > > > > is out of 100% or 100% * #total processing threads. My feeling is
> > > > > > that the latter is slightly better based on my explanation
> > > > > > earlier. The way to describe this quota to the users can be
> > > > > > "share of elapsed request processing time on a single CPU"
> > > > > > (similar to top).
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <
> > > > rajinisivaram@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Jun,
> > > > > > >
> > > > > > > Agree about the two scenarios.
> > > > > > >
> > > > > > > But still not sure about a single quota covering both network
> > > threads
> > > > > and
> > > > > > > I/O threads with per-thread quota. If there are 10 I/O threads
> > and
> > > 5
> > > > > > > network threads and I want to assign half the quota to userA,
> the
> > > > quota
> > > > > > > would be 750%. I imagine, internally, we would convert this to
> > 500%
> > > > for
> > > > > > I/O
> > > > > > > and 250% for network threads to allocate 50% of each pool.
> > > > > > >
> > > > > > > A couple of scenarios:
> > > > > > >
> > > > > > > 1. Admin adds 1 extra network thread. To retain 50%, admin
> needs
> > to
> > > > now
> > > > > > > allocate 800% for each user. Or increase the quota for a few
> > users.
> > > > To
> > > > > > me,
> > > > > > > it feels like admin needs to convert 50% to 800% and Kafka
> > > internally
> > > > > > needs
> > > > > > > to convert 800% to (500%, 300%). Everyone using just 50% feels
> a
> > > lot
> > > > > > > simpler.
> > > > > > >
> > > > > > > 2. We decide to add some other thread to this list. Admin needs
> > to
> > > > know
> > > > > > > exactly how many threads form the maximum quota. And we can be
> > > > changing
> > > > > > > this between broker versions as we add more to the list. Again
> a
> > > > single
> > > > > > > overall percent would be a lot simpler.
> > > > > > >
> > > > > > > There were others who were unconvinced by a single percent from
> > the
> > > > > > initial
> > > > > > > proposal and were happier with thread units similar to CPU
> units,
> > > so
> > > > I
> > > > > am
> > > > > > > ok with going with per-thread quotas (as units or percent).
> Just
> > > not
> > > > > sure
> > > > > > > it makes it easier for admin in all cases.
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > >
> > > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io>
> > wrote:
> > > > > > >
> > > > > > > > Hi, Rajini,
> > > > > > > >
> > > > > > > > Consider modeling as n * 100% unit. For 2), the question is
> > > what's
> > > > > > > causing
> > > > > > > > the I/O threads to be saturated. It's unlikely that all
> users'
> > > > > > > utilization
> > > > > > > > have increased at the same time. A more likely case is that a few
> > > > isolated
> > > > > > > > users' utilization have increased. If so, after increasing
> the
> > > > number
> > > > > > of
> > > > > > > > threads, the admin just needs to adjust the quota for a few
> > > > isolated
> > > > > > > users,
> > > > > > > > which is expected and is less work.
> > > > > > > >
> > > > > > > > Consider modeling as 1 * 100% unit. For 1), all users' quota
> > need
> > > > to
> > > > > be
> > > > > > > > adjusted, which is unexpected and is more work.
> > > > > > > >
> > > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > > >
> > > > > > > > As for future extension to cover network thread utilization,
> I
> > > was
> > > > > > > thinking
> > > > > > > > that one way is to simply model the capacity as (n + m) *
> 100%
> > > > unit,
> > > > > > > where
> > > > > > > > n and m are the number of network and i/o threads,
> > respectively.
> > > > > Then,
> > > > > > > for
> > > > > > > > each user, we can just add up the utilization in the network
> > and
> > > > the
> > > > > > i/o
> > > > > > > > thread. If we do this, we don't need a new type of quota.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <
> > > > > > rajinisivaram@gmail.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Jun,
> > > > > > > > >
> > > > > > > > > If we use request.percentage as the percentage used in a
> > single
> > > > I/O
> > > > > > > > thread,
> > > > > > > > > the total percentage being allocated will be
> num.io.threads *
> > > 100
> > > > > for
> > > > > > > I/O
> > > > > > > > > threads and num.network.threads * 100 for network threads.
> A
> > > > single
> > > > > > > quota
> > > > > > > > > covering the two as a percentage wouldn't quite work if you
> > > want
> > > > to
> > > > > > > > > allocate the same proportion in both cases. If we want to
> > treat
> > > > > > threads
> > > > > > > > as
> > > > > > > > > separate units, won't we need two quota configurations
> > > regardless
> > > > > of
> > > > > > > > > whether we use units or percentage? Perhaps I misunderstood
> > > your
> > > > > > > > > suggestion.
> > > > > > > > >
> > > > > > > > > I think there are two cases:
> > > > > > > > >
> > > > > > > > >    1. The use case that you mentioned where an admin is
> > adding
> > > > more
> > > > > > > users
> > > > > > > > >    and decides to add more I/O threads and expects to find
> > free
> > > > > quota
> > > > > > > to
> > > > > > > > >    allocate for new users.
> > > > > > > > >    2. Admin adds more I/O threads because the I/O threads
> are
> > > > > > saturated
> > > > > > > > and
> > > > > > > > >    there are cores available to allocate, even though the
> > > number
> > > > or
> > > > > > > > >    users/clients hasn't changed.
> > > > > > > > >
> > > > > > > > > If we treated I/O threads as a single unit of
> 100%,
> > > all
> > > > > > user
> > > > > > > > > quotas need to be reallocated for 1). If we allocated I/O
> > > threads
> > > > > as
> > > > > > n
> > > > > > > > > units with n*100%, all user quotas need to be reallocated
> for
> > > 2),
> > > > > > > > otherwise
> > > > > > > > > some of the new threads may just not be used. Either way it
> > > > should
> > > > > be
> > > > > > > > easy
> > > > > > > > > to write a script to decrease/increase quotas by a multiple
> > for
> > > > all
> > > > > > > > users.
> > > > > > > > >
> > > > > > > > > So it really boils down to which quota unit is most
> intuitive
> > > in
> > > > > > terms
> > > > > > > of
> > > > > > > > > configuration. And from the discussion so far, it feels
> like
> > > > > opinion
> > > > > > is
> > > > > > > > > divided on whether quotas should be carved out of an
> absolute
> > > > 100%
> > > > > > (or
> > > > > > > 1
> > > > > > > > > unit) or be relative to the number of threads (n*100% or n
> > > > units).
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io>
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Another way to express an absolute limit is to use
> > > > > > > request.percentage,
> > > > > > > > > but
> > > > > > > > > > treat it as the percentage used in a single request
> > handling
> > > > > > thread.
> > > > > > > > For
> > > > > > > > > > now, the request handling threads can be just the io
> > threads.
> > > > In
> > > > > > the
> > > > > > > > > > future, they can cover the network threads as well. This
> is
> > > > > similar
> > > > > > > to
> > > > > > > > > how
> > > > > > > > > > top reports CPU usage and may be a bit easier for people
> to
> > > > > > > understand.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <
> > jun@confluent.io>
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Jay,
> > > > > > > > > > >
> > > > > > > > > > > 2. Regarding request.unit vs request.percentage. I
> > started
> > > > with
> > > > > > > > > > > request.percentage too. The reasoning for request.unit
> is
> > > the
> > > > > > > > > following.
> > > > > > > > > > > Suppose that the capacity has been reached on a broker
> > and
> > > > the
> > > > > > > admin
> > > > > > > > > > needs
> > > > > > > > > > > to add a new user. A simple way to increase the
> capacity
> > is
> > > > to
> > > > > > > > increase
> > > > > > > > > > the
> > > > > > > > > > > number of io threads, assuming there are still enough
> > > cores.
> > > > If
> > > > > > the
> > > > > > > > > limit
> > > > > > > > > > > is based on percentage, the additional capacity
> > > automatically
> > > > > > gets
> > > > > > > > > > > distributed to existing users and we haven't really
> > carved
> > > > out
> > > > > > any
> > > > > > > > > > > additional resource for the new user. Now, is it easy
> > for a
> > > > > user
> > > > > > to
> > > > > > > > > > reason
> > > > > > > > > > > about 0.1 unit vs 10%? My feeling is that both are hard
> > and
> > > > > have
> > > > > > to
> > > > > > > > be
> > > > > > > > > > > configured empirically. Not sure if percentage is
> > obviously
> > > > > > easier
> > > > > > > to
> > > > > > > > > > > reason about.
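> > > > > > > > > > >
> > > > > > > > > > > For comparison, the arithmetic relating the two schemes,
> > > > > > > > > > > assuming the quota covers only the I/O thread pool
> > > > > > > > > > > (illustrative values only):
> > > > > > > > > > >
> > > > > > > > > > > val numIoThreads = 10
> > > > > > > > > > > val units = 0.1                     // fraction of one thread
> > > > > > > > > > > val percentOfPool = units / numIoThreads * 100  // = 1%
> > > > > > > > > > > val tenPercentInUnits = 0.10 * numIoThreads     // = 1.0 unit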
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <
> > > jay@confluent.io
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > >> A couple of quick points:
> > > > > > > > > > >>
> > > > > > > > > > >> 1. Even though the implementation of this quota is
> only
> > > > using
> > > > > io
> > > > > > > > > thread
> > > > > > > > > > >> time, i think we should call it something like
> > > > "request-time".
> > > > > > > This
> > > > > > > > > will
> > > > > > > > > > >> give us flexibility to improve the implementation to
> > cover
> > > > > > network
> > > > > > > > > > threads
> > > > > > > > > > >> in the future and will avoid exposing internal details
> > > like
> > > > > our
> > > > > > > > thread
> > > > > > > > > > >> pools on the server.
> > > > > > > > > > >>
> > > > > > > > > > >> 2. Jun/Roger, I get what you are trying to fix but the
> > > idea
> > > > of
> > > > > > > > > > >> thread/units
> > > > > > > > > > >> is super unintuitive as a user-facing knob. I had to
> > read
> > > > the
> > > > > > KIP
> > > > > > > > like
> > > > > > > > > > >> eight times to understand this. I'm not sure about your
> > > > > > > > > > >> point that increasing the number of threads is a problem
> > > > > > > > > > >> with a percentage-based value; it really depends on
> > > > > > > > > > >> whether the user thinks
> > about
> > > > the
> > > > > > > > > > "percentage
> > > > > > > > > > >> of request processing time" or "thread units". If they
> > > think
> > > > > "I
> > > > > > > have
> > > > > > > > > > >> allocated 10% of my request processing time to user x"
> > > then
> > > > it
> > > > > > is
> > > > > > > a
> > > > > > > > > bug
> > > > > > > > > > >> that increasing the thread count decreases that
> percent
> > as
> > > > it
> > > > > > does
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > >> current proposal. As a practical matter I think the
> only
> > > way
> > > > > to
> > > > > > > > > actually
> > > > > > > > > > >> reason about this is as a percent---I just don't
> believe
> > > > > people
> > > > > > > are
> > > > > > > > > > going
> > > > > > > > > > >> to think, "ah, 4.3 thread units, that is the right
> > > amount!".
> > > > > > > > Instead I
> > > > > > > > > > >> think they have to understand this thread unit
> concept,
> > > > figure
> > > > > > out
> > > > > > > > > what
> > > > > > > > > > >> they have set in number of threads, compute a percent
> > and
> > > > then
> > > > > > > come
> > > > > > > > up
> > > > > > > > > > >> with
> > > > > > > > > > >> the number of thread units, and these will all be
> wrong
> > if
> > > > > that
> > > > > > > > thread
> > > > > > > > > > >> count changes. I also think this ties us to throttling
> > the
> > > > I/O
> > > > > > > > thread
> > > > > > > > > > >> pool,
> > > > > > > > > > >> which may not be where we want to end up.
> > > > > > > > > > >>
> > > > > > > > > > >> 3. For what it's worth I do think having a single
> > > > throttle_ms
> > > > > > > field
> > > > > > > > in
> > > > > > > > > > all
> > > > > > > > > > >> the responses that combines all throttling from all
> > quotas
> > > > is
> > > > > > > > probably
> > > > > > > > > > the
> > > > > > > > > > >> simplest. There could be a use case for having
> separate
> > > > fields
> > > > > > for
> > > > > > > > > each,
> > > > > > > > > > >> but I think that is actually harder to use/monitor in
> > the
> > > > > common
> > > > > > > > case
> > > > > > > > > so
> > > > > > > > > > >> unless someone has a use case I think just one should
> be
> > > > fine.
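> > > > > > > > > > >>
> > > > > > > > > > >> For what that could look like, a sketch, assuming the two
> > > > > > > > > > >> delays are computed independently (hypothetical names; since
> > > > > > > > > > >> the delays overlap rather than stack, the max reflects the
> > > > > > > > > > >> total wait):
> > > > > > > > > > >>
> > > > > > > > > > >> val byteRateThrottleMs: Long = 120   // from the byte-rate quota
> > > > > > > > > > >> val requestTimeThrottleMs: Long = 80 // from the request-time quota
> > > > > > > > > > >> val throttleTimeMs = math.max(byteRateThrottleMs, requestTimeThrottleMs)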
> > > > > > > > > > >>
> > > > > > > > > > >> -Jay
> > > > > > > > > > >>
> > > > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > > > > > > > rajinisivaram@gmail.com>
> > > > > > > > > > >> wrote:
> > > > > > > > > > >>
> > > > > > > > > > >> > I have updated the KIP based on the discussions so
> > far.
> > > > > > > > > > >> >
> > > > > > > > > > >> >
> > > > > > > > > > >> > Regards,
> > > > > > > > > > >> >
> > > > > > > > > > >> > Rajini
> > > > > > > > > > >> >
> > > > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > > > > > > > >> rajinisivaram@gmail.com>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> >
> > > > > > > > > > >> > > Thank you all for the feedback.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Ismael #1. It makes sense not to throttle
> > inter-broker
> > > > > > > requests
> > > > > > > > > like
> > > > > > > > > > >> > > LeaderAndIsr etc. The simplest way to ensure that
> > > > clients
> > > > > > > cannot
> > > > > > > > > use
> > > > > > > > > > >> > these
> > > > > > > > > > >> > > requests to bypass quotas for DoS attacks is to
> > ensure
> > > > > that
> > > > > > > ACLs
> > > > > > > > > > >> prevent
> > > > > > > > > > >> > > clients from using these requests and unauthorized
> > > > > requests
> > > > > > > are
> > > > > > > > > > >> included
> > > > > > > > > > >> > > towards quotas.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Ismael #2, Jay #1 : I was thinking that these
> quotas
> > > can
> > > > > > > return
> > > > > > > > a
> > > > > > > > > > >> > separate
> > > > > > > > > > >> > > throttle time, and all utilization based quotas
> > could
> > > > use
> > > > > > the
> > > > > > > > same
> > > > > > > > > > >> field
> > > > > > > > > > >> > > (we won't add another one for network thread
> > > utilization
> > > > > for
> > > > > > > > > > >> instance).
> > > > > > > > > > >> > But
> > > > > > > > > > >> > > perhaps it makes sense to keep byte rate quotas
> > > separate
> > > > > in
> > > > > > > > > > >> produce/fetch
> > > > > > > > > > >> > > responses to provide separate metrics? Agree with
> > > Ismael
> > > > > > that
> > > > > > > > the
> > > > > > > > > > >> name of
> > > > > > > > > > >> > > the existing field should be changed if we have
> two.
> > > > Happy
> > > > > > to
> > > > > > > > > switch
> > > > > > > > > > >> to a
> > > > > > > > > > >> > > single combined throttle time if that is
> sufficient.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will use dot
> > > > separated
> > > > > > > name
> > > > > > > > > for
> > > > > > > > > > >> new
> > > > > > > > > > >> > > property. Replication quotas use dot separated, so
> > it
> > > > will
> > > > > > be
> > > > > > > > > > >> consistent
> > > > > > > > > > >> > > with all properties except byte rate quotas.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Radai: #1 Request processing time rather than
> > request
> > > > rate
> > > > > > > was
> > > > > > > > > > chosen
> > > > > > > > > > >> > > because the time per request can vary
> significantly
> > > > > between
> > > > > > > > > requests
> > > > > > > > > > >> as
> > > > > > > > > > >> > > mentioned in the discussion and KIP.
> > > > > > > > > > >> > > #2 Two separate quotas for heartbeats/regular
> > requests
> > > > > feel
> > > > > > > like
> > > > > > > > > > more
> > > > > > > > > > >> > > configuration and more metrics. Since most users
> > would
> > > > set
> > > > > > > > quotas
> > > > > > > > > > >> higher
> > > > > > > > > > >> > > than the expected usage and quotas are more of a
> > > safety
> > > > > > net, a
> > > > > > > > > > single
> > > > > > > > > > >> > quota
> > > > > > > > > > >> > > should work in most cases.
> > > > > > > > > > >> > >  #3 The number of requests in purgatory is limited
> > by
> > > > the
> > > > > > > number
> > > > > > > > > of
> > > > > > > > > > >> > active
> > > > > > > > > > >> > > connections since only one request per connection
> > will
> > > > be
> > > > > > > > > throttled
> > > > > > > > > > >> at a
> > > > > > > > > > >> > > time.
> > > > > > > > > > >> > > #4 As with byte rate quotas, to use the full
> > allocated
> > > > > > quotas,
> > > > > > > > > > >> > > clients/users would need to use partitions that
> are
> > > > > > > distributed
> > > > > > > > > > across
> > > > > > > > > > >> > the
> > > > > > > > > > >> > > cluster. The alternative of using cluster-wide
> > quotas
> > > > > > instead
> > > > > > > of
> > > > > > > > > > >> > per-broker
> > > > > > > > > > >> > > quotas would be far too complex to implement.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Dong : We currently have two ClientQuotaManagers
> for
> > > > quota
> > > > > > > types
> > > > > > > > > > Fetch
> > > > > > > > > > >> > and
> > > > > > > > > > >> > > Produce. A new one will be added for IOThread,
> which
> > > > > manages
> > > > > > > > > quotas
> > > > > > > > > > >> for
> > > > > > > > > > >> > I/O
> > > > > > > > > > >> > > thread utilization. This will not update the Fetch
> > or
> > > > > > Produce
> > > > > > > > > > >> queue-size,
> > > > > > > > > > >> > > but will have a separate metric for the
> > queue-size.  I
> > > > > > wasn't
> > > > > > > > > > >> planning to
> > > > > > > > > > >> > > add any additional metrics apart from the
> equivalent
> > > > ones
> > > > > > for
> > > > > > > > > > existing
> > > > > > > > > > >> > > quotas as part of this KIP. Ratio of byte-rate to
> > I/O
> > > > > thread
> > > > > > > > > > >> utilization
> > > > > > > > > > >> > > could be slightly misleading since it depends on
> the
> > > > > > sequence
> > > > > > > of
> > > > > > > > > > >> > requests.
> > > > > > > > > > >> > > But we can look into more metrics after the KIP is
> > > > > > implemented
> > > > > > > > if
> > > > > > > > > > >> > required.
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > I think we need to limit the maximum delay since
> all
> > > > > > requests
> > > > > > > > are
> > > > > > > > > > >> > > throttled. If a client has a quota of 0.001 units
> > and
> > > a
> > > > > > single
> > > > > > > > > > request
> > > > > > > > > > >> > used
> > > > > > > > > > >> > > 50ms, we don't want to delay all requests from the
> > > > client
> > > > > by
> > > > > > > 50
> > > > > > > > > > >> seconds,
> > > > > > > > > > >> > > throwing the client out of all its consumer
> groups.
> > > The
> > > > > > issue
> > > > > > > is
> > > > > > > > > > only
> > > > > > > > > > >> if
> > > > > > > > > > >> > a
> > > > > > > > > > >> > > user is allocated a quota that is insufficient to
> > > > process
> > > > > > one
> > > > > > > > > large
> > > > > > > > > > >> > > request. The expectation is that the units
> allocated
> > > per
> > > > > > user
> > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > >> > much
> > > > > > > > > > >> > > higher than the time taken to process one request
> > and
> > > > the
> > > > > > > limit
> > > > > > > > > > should
> > > > > > > > > > >> > > seldom be applied. Agree this needs proper
> > > > documentation.
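> > > > > > > > > > >> > >
> > > > > > > > > > >> > > To spell out the arithmetic behind that example (assuming
> > > > > > > > > > >> > > the delay is computed so usage amortizes to the quota):
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > val quotaUnits = 0.001     // 0.1% of one I/O thread
> > > > > > > > > > >> > > val requestTimeMs = 50.0   // one large request
> > > > > > > > > > >> > > // Unbounded: 50 / 0.001 - 50 = 49,950 ms, i.e. ~50 seconds.
> > > > > > > > > > >> > > val naiveDelayMs = requestTimeMs / quotaUnits - requestTimeMs
> > > > > > > > > > >> > > val windowMs = 1000.0      // hypothetical quota window
> > > > > > > > > > >> > > val cappedDelayMs = math.min(naiveDelayMs, windowMs)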
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Regards,
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > Rajini
> > > > > > > > > > >> > >
> > > > > > > > > > >> > >
> > > > > > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > > > > > > radai.rosenblatt@gmail.com>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> > >
> > > > > > > > > > >> > >> @jun: i wasnt concerned about tying up a request
> > > > > processing
> > > > > > > > > thread,
> > > > > > > > > > >> but
> > > > > > > > > > >> > >> IIUC the code does still read the entire request
> > out,
> > > > > which
> > > > > > > > might
> > > > > > > > > > >> > >> add up
> > > > > > > > > > >> > >> to
> > > > > > > > > > >> > >> a non-negligible amount of memory.
> > > > > > > > > > >> > >>
> > > > > > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > > > > > > > lindong28@gmail.com>
> > > > > > > > > > >> wrote:
> > > > > > > > > > >> > >>
> > > > > > > > > > >> > >> > Hey Rajini,
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > The current KIP says that the maximum delay
> will
> > be
> > > > > > reduced
> > > > > > > > to
> > > > > > > > > > >> window
> > > > > > > > > > >> > >> size
> > > > > > > > > > >> > >> > if it is larger than the window size. I have a
> > > > concern
> > > > > > with
> > > > > > > > > this:
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > 1) This essentially means that the user is
> > allowed
> > > to
> > > > > > > exceed
> > > > > > > > > > their
> > > > > > > > > > >> > quota
> > > > > > > > > > >> > >> > over a long period of time. Can you provide an
> > > upper
> > > > > > bound
> > > > > > > on
> > > > > > > > > > this
> > > > > > > > > > >> > >> > deviation?
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > 2) What is the motivation for capping the maximum
> > delay
> > > > by
> > > > > > the
> > > > > > > > > window
> > > > > > > > > > >> > size?
> > > > > > > > > > >> > >> I
> > > > > > > > > > >> > >> > am wondering if there is better alternative to
> > > > address
> > > > > > the
> > > > > > > > > > problem.
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > 3) It means that the existing metric-related
> > config
> > > > > will
> > > > > > > > have a
> > > > > > > > > > >> more
> > > > > > > > > > >> > >> > direct impact on the mechanism of this
> > > > > > > io-thread-unit-based
> > > > > > > > > > >> quota.
> > > > > > > > > > >> > This
> > > > > > > > > > >> > >> > may be an important change depending on the
> > answer
> > > to
> > > > > 1)
> > > > > > > > above.
> > > > > > > > > > We
> > > > > > > > > > >> > >> probably
> > > > > > > > > > >> > >> > need to document this more explicitly.
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > Dong
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > > > > > > > lindong28@gmail.com>
> > > > > > > > > > >> > wrote:
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > > Hey Jun,
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > Yeah you are right. I thought it wasn't
> because
> > > at
> > > > > > > LinkedIn
> > > > > > > > > it
> > > > > > > > > > >> would
> > > > > > > > > > >> > be
> > > > > > > > > > >> > >> > too
> > > > > > > > > > >> > >> > > much pressure on inGraph to expose those
> > > > per-clientId
> > > > > > > > metrics
> > > > > > > > > > so
> > > > > > > > > > >> we
> > > > > > > > > > >> > >> ended
> > > > > > > > > > >> > >> > > up printing them periodically to a local log.
> > Never
> > > > > mind
> > > > > > if
> > > > > > > > it
> > > > > > > > > is
> > > > > > > > > > >> not
> > > > > > > > > > >> > a
> > > > > > > > > > >> > >> > > general problem.
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > Hey Rajini,
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - I agree with Jay that we probably don't
> want
> > to
> > > > > add a
> > > > > > > new
> > > > > > > > > > field
> > > > > > > > > > >> > for
> > > > > > > > > > >> > >> > > every quota ProduceResponse or FetchResponse.
> > Is
> > > > > there
> > > > > > > any
> > > > > > > > > > >> use-case
> > > > > > > > > > >> > >> for
> > > > > > > > > > >> > >> > > having separate throttle-time fields for
> > > > > > byte-rate-quota
> > > > > > > > and
> > > > > > > > > > >> > >> > > io-thread-unit-quota? You probably need to
> > > document
> > > > > > this
> > > > > > > as
> > > > > > > > > > >> > interface
> > > > > > > > > > >> > >> > > change if you plan to add new field in any
> > > request.
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - I don't think IOThread belongs to
> quotaType.
> > > The
> > > > > > > existing
> > > > > > > > > > quota
> > > > > > > > > > >> > >> types
> > > > > > > > > > >> > >> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> > > > > > > > > > >> identify
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > > type of request that are throttled, not the
> > quota
> > > > > > > mechanism
> > > > > > > > > > that
> > > > > > > > > > >> is
> > > > > > > > > > >> > >> > applied.
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - If a request is throttled due to this
> > > > > > > > io-thread-unit-based
> > > > > > > > > > >> quota,
> > > > > > > > > > >> > is
> > > > > > > > > > >> > >> > the
> > > > > > > > > > >> > >> > > existing queue-size metric in
> > ClientQuotaManager
> > > > > > > > incremented?
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > - In the interest of providing guide line for
> > > admin
> > > > > to
> > > > > > > > decide
> > > > > > > > > > >> > >> > > io-thread-unit-based quota and for user to
> > > > understand
> > > > > > its
> > > > > > > > > > impact
> > > > > > > > > > >> on
> > > > > > > > > > >> > >> their
> > > > > > > > > > >> > >> > > traffic, would it be useful to have a metric
> > that
> > > > > shows
> > > > > > > the
> > > > > > > > > > >> overall
> > > > > > > > > > >> > >> > > byte-rate per io-thread-unit? Can we also
> show
> > > > this as a
> > > > > > > > > > >> per-clientId
> > > > > > > > > > >> > >> > metric?
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > Thanks,
> > > > > > > > > > >> > >> > > Dong
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <
> > > > > > > jun@confluent.io
> > > > > > > > >
> > > > > > > > > > >> wrote:
> > > > > > > > > > >> > >> > >
> > > > > > > > > > >> > >> > >> Hi, Ismael,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> For #3, typically, an admin won't configure
> > more
> > > > io
> > > > > > > > threads
> > > > > > > > > > than
> > > > > > > > > > >> > CPU
> > > > > > > > > > >> > >> > >> cores,
> > > > > > > > > > >> > >> > >> but it's possible for an admin to start with
> > > fewer
> > > > > io
> > > > > > > > > threads
> > > > > > > > > > >> than
> > > > > > > > > > >> > >> cores
> > > > > > > > > > >> > >> > >> and grow that later on.
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> Hi, Dong,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> I think the throttleTime sensor on the
> broker
> > > > tells
> > > > > > the
> > > > > > > > > admin
> > > > > > > > > > >> > >> whether a
> > > > > > > > > > >> > >> > >> user/clientId is throttled or not.
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> Hi, Radai,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> The reasoning for delaying the throttled
> > > requests
> > > > on
> > > > > > the
> > > > > > > > > > broker
> > > > > > > > > > >> > >> instead
> > > > > > > > > > >> > >> > of
> > > > > > > > > > >> > >> > >> returning an error immediately is that the
> > > latter
> > > > > has
> > > > > > no
> > > > > > > > way
> > > > > > > > > > to
> > > > > > > > > > >> > >> prevent
> > > > > > > > > > >> > >> > >> the
> > > > > > > > > > >> > >> > >> client from retrying immediately, which will
> > > make
> > > > > > things
> > > > > > > > > > worse.
> > > > > > > > > > >> The
> > > > > > > > > > >> > >> > >> delaying logic is based off a delay queue. A
> > > > > separate
> > > > > > > > > > expiration
> > > > > > > > > > >> > >> thread
> > > > > > > > > > >> > >> > >> just waits on the next to be expired
> request.
> > > So,
> > > > it
> > > > > > > > doesn't
> > > > > > > > > > tie
> > > > > > > > > > >> > up a
> > > > > > > > > > >> > >> > >> request handler thread.
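> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> A minimal sketch of that mechanism (illustrative
> > > > > > > > > > >> > >> > >> only, not the actual broker code), with one
> > > > > > > > > > >> > >> > >> callback per throttled response:
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> class ThrottledResponse(delayMs: Long, val send: () => Unit)
> > > > > > > > > > >> > >> > >>     extends Delayed {
> > > > > > > > > > >> > >> > >>   private val dueNs = System.nanoTime() +
> > > > > > > > > > >> > >> > >>     TimeUnit.MILLISECONDS.toNanos(delayMs)
> > > > > > > > > > >> > >> > >>   override def getDelay(unit: TimeUnit): Long =
> > > > > > > > > > >> > >> > >>     unit.convert(dueNs - System.nanoTime(), TimeUnit.NANOSECONDS)
> > > > > > > > > > >> > >> > >>   override def compareTo(o: Delayed): Int =
> > > > > > > > > > >> > >> > >>     java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS),
> > > > > > > > > > >> > >> > >>       o.getDelay(TimeUnit.NANOSECONDS))
> > > > > > > > > > >> > >> > >> }
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> val queue = new DelayQueue[ThrottledResponse]()
> > > > > > > > > > >> > >> > >> // One expiration thread blocks until the next response
> > > > > > > > > > >> > >> > >> // is due; no request handler thread waits with it.
> > > > > > > > > > >> > >> > >> val expirer = new Thread(() => while (true) queue.take().send())
> > > > > > > > > > >> > >> > >> expirer.setDaemon(true)
> > > > > > > > > > >> > >> > >> expirer.start()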
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> Thanks,
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> Jun
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael
> Juma <
> > > > > > > > > > ismael@juma.me.uk
> > > > > > > > > > >> >
> > > > > > > > > > >> > >> wrote:
> > > > > > > > > > >> > >> > >>
> > > > > > > > > > >> > >> > >> > Hi Jay,
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > Regarding 1, I definitely like the
> > simplicity
> > > of
> > > > > > > > keeping a
> > > > > > > > > > >> single
> > > > > > > > > > >> > >> > >> throttle
> > > > > > > > > > >> > >> > >> > time field in the response. The downside
> is
> > > that
> > > > > the
> > > > > > > > > client
> > > > > > > > > > >> > metrics
> > > > > > > > > > >> > >> > >> will be
> > > > > > > > > > >> > >> > >> > more coarse grained.
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > Regarding 3, we have
> > > > > > > > > > >> > >> > >> > `leader.imbalance.per.broker.percentage` and
> > > > > > > > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > Ismael
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay
> Kreps <
> > > > > > > > > > jay@confluent.io>
> > > > > > > > > > >> > >> wrote:
> > > > > > > > > > >> > >> > >> >
> > > > > > > > > > >> > >> > >> > > A few minor comments:
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > >    1. Isn't it the case that the
> > throttling
> > > > time
> > > > > > > > > response
> > > > > > > > > > >> field
> > > > > > > > > > >> > >> > should
> > > > > > > > > > >> > >> > >> > have
> > > > > > > > > > >> > >> > >> > >    the total time your request was
> > throttled
> > > > > > > > > irrespective
> > > > > > > > > > of
> > > > > > > > > > >> > the
> > > > > > > > > > >> > >> > >> quotas
> > > > > > > > > > >> > >> > >> > > that
> > > > > > > > > > >> > >> > >> > >    caused that? Limiting it to byte rate
> > > quota
> > > > > > > doesn't
> > > > > > > > > > make
> > > > > > > > > > >> > >> sense,
> > > > > > > > > > >> > >> > >> but I
> > > > > > > > > > >> > >> > >> > > also
> > > > > > > > > > >> > >> > >> > >    don't think we want to end up
> adding
> > > new
> > > > > > fields
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > >> > >> response
> > > > > > > > > > >> > >> > >> for
> > > > > > > > > > >> > >> > >> > > every
> > > > > > > > > > >> > >> > >> > >    single thing we quota, right?
> > > > > > > > > > >> > >> > >> > >    2. I don't think we should make this
> > > quota
> > > > > > > > > specifically
> > > > > > > > > > >> > about
> > > > > > > > > > >> > >> io
> > > > > > > > > > >> > >> > >> > >    threads. Once we introduce these
> quotas
> > > > > people
> > > > > > > set
> > > > > > > > > them
> > > > > > > > > > >> and
> > > > > > > > > > >> > >> > expect
> > > > > > > > > > >> > >> > >> > them
> > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > >> > >> > >> > >    be enforced (and if they aren't it
> may
> > > > cause
> > > > > an
> > > > > > > > > > outage).
> > > > > > > > > > >> As
> > > > > > > > > > >> > a
> > > > > > > > > > >> > >> > >> result
> > > > > > > > > > >> > >> > >> > > they
> > > > > > > > > > >> > >> > >> > >    are a bit more sensitive than normal
> > > > > configs, I
> > > > > > > > > think.
> > > > > > > > > > >> The
> > > > > > > > > > >> > >> > current
> > > > > > > > > > >> > >> > >> > > thread
> > > > > > > > > > >> > >> > >> > >    pools seem like something of an
> > > > > implementation
> > > > > > > > detail
> > > > > > > > > > and
> > > > > > > > > > >> > not
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > >> > level
> > > > > > > > > > >> > >> > >> > > the
> > > > > > > > > > >> > >> > >> > >    user-facing quotas should be involved
> > > > with. I
> > > > > > > think
> > > > > > > > > it
> > > > > > > > > > >> might
> > > > > > > > > > >> > >> be
> > > > > > > > > > >> > >> > >> better
> > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > >> > >> > >> > >    make this a general request-time
> > throttle
> > > > > with
> > > > > > no
> > > > > > > > > > >> mention in
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > >> > naming
> > > > > > > > > > >> > >> > >> > >    about I/O threads and simply
> > acknowledge
> > > > the
> > > > > > > > current
> > > > > > > > > > >> > >> limitation
> > > > > > > > > > >> > >> > >> (which
> > > > > > > > > > >> > >> > >> > > we
> > > > > > > > > > >> > >> > >> > >    may someday fix) in the docs that
> this
> > > > covers
> > > > > > > only
> > > > > > > > > the
> > > > > > > > > > >> time
> > > > > > > > > > >> > >> after
> > > > > > > > > > >> > >> > >> the
> > > > > > > > > > >> > >> > >> > >    thread is read off the network.
> > > > > > > > > > >> > >> > >> > >    3. As such I think the right
> interface
> > to
> > > > the
> > > > > > > user
> > > > > > > > > > would
> > > > > > > > > > >> be
> > > > > > > > > > >> > >> > >> something
> > > > > > > > > > >> > >> > >> > >    like percent_request_time and be in
> > > > > {0,...100}
> > > > > > or
> > > > > > > > > > >> > >> > >> request_time_ratio
> > > > > > > > > > >> > >> > >> > > and be
> > > > > > > > > > >> > >> > >> > >    in {0.0,...,1.0} (I think "ratio" is
> > the
> > > > > > > > terminology
> > > > > > > > > we
> > > > > > > > > > >> used
> > > > > > > > > > >> > >> if
> > > > > > > > > > >> > >> > the
> > > > > > > > > > >> > >> > >> > > scale
> > > > > > > > > > >> > >> > >> > >    is between 0 and 1 in the other
> > metrics,
> > > > > > right?)
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > -Jay
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini
> > > > Sivaram
> > > > > <
> > > > > > > > > > >> > >> > >> rajinisivaram@gmail.com
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > wrote:
> > > > > > > > > > >> > >> > >> > >
> > > > > > > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Thank you for the feedback.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Guozhang : I have updated the section
> on
> > > > > > > > co-existence
> > > > > > > > > of
> > > > > > > > > > >> byte
> > > > > > > > > > >> > >> rate
> > > > > > > > > > >> > >> > >> and
> > > > > > > > > > >> > >> > >> > > > request time quotas.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Dong: I hadn't added much detail to
> the
> > > > > metrics
> > > > > > > and
> > > > > > > > > > >> sensors
> > > > > > > > > > >> > >> since
> > > > > > > > > > >> > >> > >> they
> > > > > > > > > > >> > >> > >> > > are
> > > > > > > > > > >> > >> > >> > > > going to be very similar to the
> existing
> > > > > metrics
> > > > > > > and
> > > > > > > > > > >> sensors.
> > > > > > > > > > >> > >> To
> > > > > > > > > > >> > >> > >> avoid
> > > > > > > > > > >> > >> > >> > > > confusion, I have now added more
> detail.
> > > All
> > > > > > > metrics
> > > > > > > > > are
> > > > > > > > > > >> in
> > > > > > > > > > >> > the
> > > > > > > > > > >> > >> > >> group
> > > > > > > > > > >> > >> > >> > > > "quotaType" and all sensors have names
> > > > > starting
> > > > > > > with
> > > > > > > > > > >> > >> "quotaType"
> > > > > > > > > > >> > >> > >> (where
> > > > > > > > > > >> > >> > >> > > > quotaType is Produce/Fetch/
> > > > LeaderReplication/
> > > > > > > > > > >> > >> > >> > > > FollowerReplication/*IOThread*).
> > > > > > > > > > >> > >> > >> > > > So there will be no reuse of existing
> > > > > > > > metrics/sensors.
> > > > > > > > > > The
> > > > > > > > > > >> > new
> > > > > > > > > > >> > >> > ones
> > > > > > > > > > >> > >> > >> for
> > > > > > > > > > >> > >> > >> > > > request processing time based
> throttling
> > > > will
> > > > > be
> > > > > > > > > > >> completely
> > > > > > > > > > >> > >> > >> independent
> > > > > > > > > > >> > >> > >> > > of
> > > > > > > > > > >> > >> > >> > > > existing metrics/sensors, but will be
> > > > > consistent
> > > > > > > in
> > > > > > > > > > >> format.
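> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > For concreteness, sensor registration would look
> > > > > > > > > > >> > >> > >> > > > roughly like this sketch against the common metrics
> > > > > > > > > > >> > >> > >> > > > API (names illustrative, not the final ones):
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > import org.apache.kafka.common.metrics.Metrics
> > > > > > > > > > >> > >> > >> > > > import org.apache.kafka.common.metrics.stats.Avg
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > val metrics = new Metrics()
> > > > > > > > > > >> > >> > >> > > > val quotaType = "IOThread"
> > > > > > > > > > >> > >> > >> > > > // e.g. sensor "IOThreadThrottleTime-user1" in group
> > > > > > > > > > >> > >> > >> > > > // "IOThread", independent of Produce/Fetch sensors.
> > > > > > > > > > >> > >> > >> > > > val sensor = metrics.sensor(s"${quotaType}ThrottleTime-user1")
> > > > > > > > > > >> > >> > >> > > > sensor.add(metrics.metricName("throttle-time", quotaType,
> > > > > > > > > > >> > >> > >> > > >   "Average throttle time per user"), new Avg())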
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > The existing throttle_time_ms field in
> > > > > > > produce/fetch
> > > > > > > > > > >> > responses
> > > > > > > > > > >> > >> > will
> > > > > > > > > > >> > >> > >> not
> > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > >> > >> > >> > > > impacted by this KIP. That will
> continue
> > > to
> > > > > > return
> > > > > > > > > > >> byte-rate
> > > > > > > > > > >> > >> based
> > > > > > > > > > >> > >> > >> > > > throttling times. In addition, a new
> > field
> > > > > > > > > > >> > >> > request_throttle_time_ms
> > > > > > > > > > >> > >> > >> > will
> > > > > > > > > > >> > >> > >> > > be
> > > > > > > > > > >> > >> > >> > > > added to return request quota based
> > > > throttling
> > > > > > > > times.
> > > > > > > > > > >> These
> > > > > > > > > > >> > >> will
> > > > > > > > > > >> > >> > be
> > > > > > > > > > >> > >> > >> > > exposed
> > > > > > > > > > >> > >> > >> > > > as new metrics on the client-side.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Since all metrics and sensors are
> > > different
> > > > > for
> > > > > > > each
> > > > > > > > > > type
> > > > > > > > > > >> of
> > > > > > > > > > >> > >> > quota,
> > > > > > > > > > >> > >> > >> I
> > > > > > > > > > >> > >> > >> > > > believe there is already sufficient
> > > metrics
> > > > to
> > > > > > > > monitor
> > > > > > > > > > >> > >> throttling
> > > > > > > > > > >> > >> > on
> > > > > > > > > > >> > >> > >> > both
> > > > > > > > > > >> > >> > >> > > > client and broker side for each type
> of
> > > > > > > throttling.
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Regards,
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > Rajini
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong
> > Lin
> > > <
> > > > > > > > > > >> > lindong28@gmail.com
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > > >> > >> > >> > > >
> > > > > > > > > > >> > >> > >> > > > > Hey Rajini,
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > I think it makes a lot of sense to
> use
> > > > > > > > > io_thread_units
> > > > > > > > > > >> as
> > > > > > > > > > >> > >> a metric
> > > > > > > > > > >> > >> > >> to
> > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > >> > >> > >> > > > > users' traffic here. LGTM overall. I
> > > have
> > > > > some
> > > > > > > > > > questions
> > > > > > > > > > >> > >> > regarding
> > > > > > > > > > >> > >> > >> > > > sensors.
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - Can you be more specific in the
> KIP
> > > what
> > > > > > > sensors
> > > > > > > > > > will
> > > > > > > > > > >> be
> > > > > > > > > > >> > >> > added?
> > > > > > > > > > >> > >> > >> For
> > > > > > > > > > >> > >> > >> > > > > example, it will be useful to
> specify
> > > the
> > > > > name
> > > > > > > and
> > > > > > > > > > >> > >> attributes of
> > > > > > > > > > >> > >> > >> > these
> > > > > > > > > > >> > >> > >> > > > new
> > > > > > > > > > >> > >> > >> > > > > sensors.
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - We currently have throttle-time
> and
> > > > > > queue-size
> > > > > > > > for
> > > > > > > > > > >> > >> byte-rate
> > > > > > > > > > >> > >> > >> based
> > > > > > > > > > >> > >> > >> > > > quota.
> > > > > > > > > > >> > >> > >> > > > > Are you going to have separate
> > > > throttle-time
> > > > > > and
> > > > > > > > > > >> queue-size
> > > > > > > > > > >> > >> for
> > > > > > > > > > >> > >> > >> > > requests
> > > > > > > > > > >> > >> > >> > > > > throttled by io_thread_unit-based
> > quota,
> > > > or
> > > > > > will
> > > > > > > > > they
> > > > > > > > > > >> share
> > > > > > > > > > >> > >> the
> > > > > > > > > > >> > >> > >> same
> > > > > > > > > > >> > >> > >> > > > > sensor?
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - Does the throttle-time in the
> > > > > > ProduceResponse
> > > > > > > > and
> > > > > > > > > > >> > >> > FetchResponse
> > > > > > > > > > >> > >> > >> > > > contain
> > > > > > > > > > >> > >> > >> > > > > time due to io_thread_unit-based
> > quota?
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > - Currently the Kafka server doesn't
> > > > provide
> > > > > > any
> > > > > > > > log
> > > > > > > > > > or
> > > > > > > > > > >> > >> metrics
> > > > > > > > > > >> > >> > >> that
> > > > > > > > > > >> > >> > >> > > > tell
> > > > > > > > > > >> > >> > >> > > > > whether any given clientId (or user)
> > is
> > > > > > > throttled.
> > > > > > > > > > This
> > > > > > > > > > >> is
> > > > > > > > > > >> > >> not
> > > > > > > > > > >> > >> > too
> > > > > > > > > > >> > >> > >> > bad
> > > > > > > > > > >> > >> > >> > > > > because we can still check the
> > > client-side
> > > > > > > > byte-rate
> > > > > > > > > > >> metric
> > > > > > > > > > >> > >> to
> > > > > > > > > > >> > >> > >> > validate
> > > > > > > > > > >> > >> > >> > > > > whether a given client is throttled.
> > But
> > > > > with
> > > > > > > this
> > > > > > > > > > >> > >> > io_thread_unit,
> > > > > > > > > > >> > >> > >> > > there
> > > > > > > > > > >> > >> > >> > > > > will be no way to validate whether a
> > > given
> > > > > > > client
> > > > > > > > is
> > > > > > > > > > >> slow
> > > > > > > > > > >> > >> > because
> > > > > > > > > > >> > >> > >> it
> > > > > > > > > > >> > >> > >> > > has
> > > > > > > > > > >> > >> > >> > > > > exceeded its io_thread_unit limit.
> It
> > is
> > > > > > > necessary
> > > > > > > > > for
> > > > > > > > > > >> the user
> > > > > > > > > > >> > >> to
> > > > > > > > > > >> > >> > be
> > > > > > > > > > >> > >> > >> > able
> > > > > > > > > > >> > >> > >> > > to
> > > > > > > > > > >> > >> > >> > > > > know this information to figure out
> > > > whether
> > > > > > they
> > > > > > > > > have
> > > > > > > > > > >> > reached
> > > > > > > > > > >> > >> > >> their
> > > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > > >> > >> > >> > > > > limit. How about we add a log4j log on
> > the
> > > > > > server
> > > > > > > > side
> > > > > > > > > > to
> > > > > > > > > > >> > >> > >> periodically
> > > > > > > > > > >> > >> > >> > > > print
> > > > > > > > > > >> > >> > >> > > > > the (client_id,
> > byte-rate-throttle-time,
> > > > > > > > > > >> > >> > >> > io-thread-unit-throttle-time)
> > > > > > > > > > >> > >> > >> > > so
> > > > > > > > > > >> > >> > >> > > > > that the Kafka administrator can identify
> > > those
> > > > > > users
> > > > > > > > that
> > > > > > > > > > >> have
> > > > > > > > > > >> > >> > reached
> > > > > > > > > > >> > >> > >> > their
> > > > > > > > > > >> > >> > >> > > > > limit and act accordingly?
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > Thanks,
> > > > > > > > > > >> > >> > >> > > > > Dong
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM,
> > > Guozhang
> > > > > > Wang <
> > > > > > > > > > >> > >> > >> wangguoz@gmail.com>
> > > > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > > >> > >> > >> > > > > > Made a pass over the doc, overall
> > LGTM
> > > > > > except
> > > > > > > a
> > > > > > > > > > minor
> > > > > > > > > > >> > >> comment
> > > > > > > > > > >> > >> > on
> > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > >> > >> > >> > > > > > throttling implementation:
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > Stated as "Request processing time
> > > > > > throttling
> > > > > > > > will
> > > > > > > > > > be
> > > > > > > > > > >> > >> applied
> > > > > > > > > > >> > >> > on
> > > > > > > > > > >> > >> > >> > top
> > > > > > > > > > >> > >> > >> > > if
> > > > > > > > > > >> > >> > >> > > > > > necessary." I thought that it
> meant
> > > the
> > > > > > > request
> > > > > > > > > > >> > processing
> > > > > > > > > > >> > >> > time
> > > > > > > > > > >> > >> > >> > > > > throttling
> > > > > > > > > > >> > >> > >> > > > > > is applied first, but on further
> > > reading I
> > > > > > found
> > > > > > > > it
> > > > > > > > > > >> > actually
> > > > > > > > > > >> > >> > >> meant to
> > > > > > > > > > >> > >> > >> > > > apply
> > > > > > > > > > >> > >> > >> > > > > > produce / fetch byte rate
> throttling
> > > > > first.
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > Also the last sentence "The
> > remaining
> > > > > delay
> > > > > > if
> > > > > > > > any
> > > > > > > > > > is
> > > > > > > > > > >> > >> applied
> > > > > > > > > > >> > >> > to
> > > > > > > > > > >> > >> > >> > the
> > > > > > > > > > >> > >> > >> > > > > > response." is a bit confusing to
> me.
> > > > Maybe
> > > > > > > > > rewording
> > > > > > > > > > >> it a
> > > > > > > > > > >> > >> bit?
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > Guozhang
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM,
> Jun
> > > > Rao <
> > > > > > > > > > >> > jun@confluent.io
> > > > > > > > > > >> > >> >
> > > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > Thanks for the updated KIP. The
> > > latest
> > > > > > > > proposal
> > > > > > > > > > >> looks
> > > > > > > > > > >> > >> good
> > > > > > > > > > >> > >> > to
> > > > > > > > > > >> > >> > >> me.
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > Jun
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM,
> > > > Rajini
> > > > > > > > Sivaram
> > > > > > > > > <
> > > > > > > > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > Thank you for the feedback.
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > 1. I have updated the KIP to
> use
> > > > > > absolute
> > > > > > > > > units
> > > > > > > > > > >> > >> instead of
> > > > > > > > > > >> > >> > >> > > > > percentage.
> > > > > > > > > > >> > >> > >> > > > > > > The
> > > > > > > > > > >> > >> > >> > > > > > > > property is called
> > > *io_thread_units*
> > > > > to
> > > > > > > > align
> > > > > > > > > > with
> > > > > > > > > > >> > the
> > > > > > > > > > >> > >> > >> thread
> > > > > > > > > > >> > >> > >> > > count
> > > > > > > > > > >> > >> > >> > > > > > > > property *num.io.threads*.
> When
> > we
> > > > > > > implement
> > > > > > > > > > >> network
> > > > > > > > > > >> > >> > thread
> > > > > > > > > > >> > >> > >> > > > > utilization
> > > > > > > > > > >> > >> > >> > > > > > > > quotas, we can add another
> > > property
> > > > > > > > > > >> > >> > *network_thread_units.*
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > 2. ControlledShutdown is
> already
> > > > > listed
> > > > > > > > under
> > > > > > > > > > the
> > > > > > > > > > >> > >> exempt
> > > > > > > > > > >> > >> > >> > > requests.
> > > > > > > > > > >> > >> > >> > > > > Jun,
> > > > > > > > > > >> > >> > >> > > > > > > did
> > > > > > > > > > >> > >> > >> > > > > > > > you mean a different request
> > that
> > > > > needs
> > > > > > to
> > > > > > > > be
> > > > > > > > > > >> added?
> > > > > > > > > > >> > >> The
> > > > > > > > > > >> > >> > >> four
> > > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > > >> > >> > >> > > > > > > > currently exempt in the KIP
> are
> > > > > > > StopReplica,
> > > > > > > > > > >> > >> > >> > ControlledShutdown,
> > > > > > > > > > >> > >> > >> > > > > > > > LeaderAndIsr and
> UpdateMetadata.
> > > > These
> > > > > > are
> > > > > > > > > > >> controlled
> > > > > > > > > > >> > >> > using
> > > > > > > > > > >> > >> > >> > > > > > ClusterAction
> > > > > > > > > > >> > >> > >> > > > > > > > ACL, so it is easy to exclude
> > and
> > > > only
> > > > > > > > > throttle
> > > > > > > > > > if
> > > > > > > > > > >> > >> > >> > unauthorized.
> > > > > > > > > > >> > >> > >> > > I
> > > > > > > > > > >> > >> > >> > > > > > wasn't
> > > > > > > > > > >> > >> > >> > > > > > > > sure if there are other
> requests
> > > > used
> > > > > > only
> > > > > > > > for
> > > > > > > > > > >> > >> > inter-broker
> > > > > > > > > > >> > >> > >> > that
> > > > > > > > > > >> > >> > >> > > > > needed
> > > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > > >> > >> > >> > > > > > > > be excluded.
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > 3. I was thinking the smallest
> > > > change
> > > > > > > would
> > > > > > > > be
> > > > > > > > > > to
> > > > > > > > > > >> > >> replace
> > > > > > > > > > >> > >> > >> all
> > > > > > > > > > >> > >> > >> > > > > > references
> > > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > > >> > >> > >> > > > > > > > *requestChannel.sendResponse()
> *
> > > > with
> > > > > a
> > > > > > > > local
> > > > > > > > > > >> method
> > > > > > > > > > >> > >> > >> > > > > > > > *sendResponseMaybeThrottle()*
> > that
> > > > > does
> > > > > > > the
> > > > > > > > > > >> > throttling
> > > > > > > > > > >> > >> if
> > > > > > > > > > >> > >> > >> any
> > > > > > > > > > >> > >> > >> > > plus
> > > > > > > > > > >> > >> > >> > > > > send
> > > > > > > > > > >> > >> > >> > > > > > > > response. If we throttle first
> > in
> > > > > > > > > > >> > *KafkaApis.handle()*,
> > > > > > > > > > >> > >> > the
> > > > > > > > > > >> > >> > >> > time
> > > > > > > > > > >> > >> > >> > > > > spent
> > > > > > > > > > >> > >> > >> > > > > > > > within the method handling the
> > > > request
> > > > > > > will
> > > > > > > > > not
> > > > > > > > > > be
> > > > > > > > > > >> > >> > recorded
> > > > > > > > > > >> > >> > >> or
> > > > > > > > > > >> > >> > >> > > used
> > > > > > > > > > >> > >> > >> > > > > in
> > > > > > > > > > >> > >> > >> > > > > > > > throttling. We can look into
> > this
> > > > > again
> > > > > > > when
> > > > > > > > > the
> > > > > > > > > > >> PR
> > > > > > > > > > >> > is
> > > > > > > > > > >> > >> > ready
> > > > > > > > > > >> > >> > >> > for
> > > > > > > > > > >> > >> > >> > > > > > review.
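> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > As a rough sketch of that wrapper (hypothetical names,
> > > > > > > > > > >> > >> > >> > > > > > > > with stubs standing in for broker internals, and a
> > > > > > > > > > >> > >> > >> > > > > > > > scheduled executor in place of a delay queue for brevity):
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > import java.util.concurrent.{Executors, TimeUnit}
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > case class Req(user: String, dequeuedAtNanos: Long)
> > > > > > > > > > >> > >> > >> > > > > > > > case class Resp(correlationId: Int)
> > > > > > > > > > >> > >> > >> > > > > > > > trait Quotas { def recordAndGetDelayMs(u: String, ns: Long): Long }
> > > > > > > > > > >> > >> > >> > > > > > > > trait Channel { def sendResponse(r: Resp): Unit }
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > class ApiHandler(quotas: Quotas, channel: Channel) {
> > > > > > > > > > >> > >> > >> > > > > > > >   private val delayer = Executors.newSingleThreadScheduledExecutor()
> > > > > > > > > > >> > >> > >> > > > > > > >   // Record the time spent handling the request, then send
> > > > > > > > > > >> > >> > >> > > > > > > >   // the response now or after the computed throttle delay.
> > > > > > > > > > >> > >> > >> > > > > > > >   def sendResponseMaybeThrottle(req: Req, resp: Resp): Unit = {
> > > > > > > > > > >> > >> > >> > > > > > > >     val delayMs = quotas.recordAndGetDelayMs(req.user,
> > > > > > > > > > >> > >> > >> > > > > > > >       System.nanoTime() - req.dequeuedAtNanos)
> > > > > > > > > > >> > >> > >> > > > > > > >     if (delayMs > 0)
> > > > > > > > > > >> > >> > >> > > > > > > >       delayer.schedule(new Runnable {
> > > > > > > > > > >> > >> > >> > > > > > > >         def run(): Unit = channel.sendResponse(resp)
> > > > > > > > > > >> > >> > >> > > > > > > >       }, delayMs, TimeUnit.MILLISECONDS)
> > > > > > > > > > >> > >> > >> > > > > > > >     else channel.sendResponse(resp)
> > > > > > > > > > >> > >> > >> > > > > > > >   }
> > > > > > > > > > >> > >> > >> > > > > > > > }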
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > Regards,
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > Rajini
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55
> PM,
> > > > Roger
> > > > > > > > Hoover
> > > > > > > > > <
> > > > > > > > > > >> > >> > >> > > > > roger.hoover@gmail.com>
> > > > > > > > > > >> > >> > >> > > > > > > > wrote:
> > > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > Great to see this KIP and
> the
> > > > > > excellent
> > > > > > > > > > >> discussion.
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > To me, Jun's suggestion
> makes
> > > > sense.
> > > > > > If
> > > > > > > > my
> > > > > > > > > > >> > >> application
> > > > > > > > > > >> > >> > is
> > > > > > > > > > >> > >> > >> > > > > allocated
> > > > > > > > > > >> > >> > >> > > > > > 1
> > > > > > > > > > >> > >> > >> > > > > > > > > request handler unit, then
> > it's
> > > as
> > > > > if
> > > > > > I
> > > > > > > > > have a
> > > > > > > > > > >> > Kafka
> > > > > > > > > > >> > >> > >> broker
> > > > > > > > > > >> > >> > >> > > with
> > > > > > > > > > >> > >> > >> > > > a
> > > > > > > > > > >> > >> > >> > > > > > > single
> > > > > > > > > > >> > >> > >> > > > > > > > > request handler thread
> > dedicated
> > > > to
> > > > > > me.
> > > > > > > > > > That's
> > > > > > > > > > >> the
> > > > > > > > > > >> > >> > most I
> > > > > > > > > > >> > >> > >> > can
> > > > > > > > > > >> > >> > >> > > > use,
> > > > > > > > > > >> > >> > >> > > > > > at
> > > > > > > > > > >> > >> > >> > > > > > > > > least.  That allocation
> > doesn't
> > > > > change
> > > > > > > > even
> > > > > > > > > if
> > > > > > > > > > >> an
> > > > > > > > > > >> > >> admin
> > > > > > > > > > >> > >> > >> later
> > > > > > > > > > >> > >> > >> > > > > > increases
> > > > > > > > > > >> > >> > >> > > > > > > > the
> > > > > > > > > > >> > >> > >> > > > > > > > > size of the request thread
> > pool
> > > on
> > > > > the
> > > > > > > > > broker.
> > > > > > > > > > >> > It's
> > > > > > > > > > >> > >> > >> similar
> > > > > > > > > > >> > >> > >> > to
> > > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > > >> > >> > >> > > > > > CPU
> > > > > > > > > > >> > >> > >> > > > > > > > > abstraction that VMs and
> > > > containers
> > > > > > get
> > > > > > > > from
> > > > > > > > > > >> > >> hypervisors
> > > > > > > > > > >> > >> > >> or
> > > > > > > > > > >> > >> > >> > OS
> > > > > > > > > > >> > >> > >> > > > > > > > schedulers.
> > > > > > > > > > >> > >> > >> > > > > > > > > While different client
> access
> > > > > patterns
> > > > > > > can
> > > > > > > > > use
> > > > > > > > > > >> > wildly
> > > > > > > > > > >> > >> > >> > different
> > > > > > > > > > >> > >> > >> > > > > > amounts
> > > > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > > > >> > >> > >> > > > > > > > > request thread resources per
> > > > > request,
> > > > > > a
> > > > > > > > > given
> > > > > > > > > > >> > >> > application
> > > > > > > > > > >> > >> > >> > will
> > > > > > > > > > >> > >> > >> > > > > > > generally
> > > > > > > > > > >> > >> > >> > > > > > > > > have a stable access pattern
> > and
> > > > can
> > > > > > > > figure
> > > > > > > > > > out
> > > > > > > > > > >> > >> > >> empirically
> > > > > > > > > > >> > >> > >> > how
> > > > > > > > > > >> > >> > >> > > > > many
> > > > > > > > > > >> > >> > >> > > > > > > > > "request thread units" it
> > needs
> > > to
> > > > > > meet
> > > > > > > > it's
> > > > > > > > > > >> > >> > >> > throughput/latency
> > > > > > > > > > >> > >> > >> > > > > > goals.
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > Cheers,
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > Roger
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > > >> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53
> > AM,
> > > > Jun
> > > > > > > Rao <
> > > > > > > > > > >> > >> > >> jun@confluent.io>
> > > > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > > > >> > >> > >> > > > > > > > >
> > Hi, Rajini,
> >
> > Thanks for the updated KIP. A few more comments.
> >
> > 1. A concern of request_time_percent is that it's not an absolute
> > value. Let's say you give a user a 10% limit. If the admin doubles the
> > number of request handler threads, that user now actually has twice the
> > absolute capacity. This may confuse people a bit. So, perhaps setting
> > the quota based on an absolute request thread unit is better.
> >
> > 2. ControlledShutdownRequest is also an inter-broker request and needs
> > to be excluded from throttling.
> >
> > 3. Implementation wise, I am wondering if it's simpler to apply the
> > request time throttling first in KafkaApis.handle(). Otherwise, we will
> > need to add the throttling logic in each type of request.
> >
> > Thanks,
> >
> > Jun
>
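A minimal Java sketch of the "throttle first" idea in Jun's point 3 above:
one quota check at the top of the request dispatch path, so no
per-request-type throttling logic is needed. All names and the delay
formula here are illustrative assumptions, not the KIP's actual code (the
real KafkaApis is Scala, and a real implementation would age out samples
per window):

    import java.util.HashMap;
    import java.util.Map;

    // Hypothetical sketch: a single quota check at the top of the
    // dispatch path, before any per-API handling.
    public class ThrottleFirstSketch {
        static final double QUOTA_PERCENT = 10.0;        // share of one handler thread
        static final long WINDOW_NANOS = 1_000_000_000L; // 1-second accounting window
        static final Map<String, Long> usedNanos = new HashMap<>();

        // Record the handler time a client consumed; return a delay in ms
        // if it is now over quota (0 means handle the request immediately).
        static long recordAndGetThrottleMs(String clientId, long requestNanos) {
            long used = usedNanos.merge(clientId, requestNanos, Long::sum);
            double percentUsed = 100.0 * used / WINDOW_NANOS;
            if (percentUsed <= QUOTA_PERCENT) return 0;
            return (long) ((percentUsed - QUOTA_PERCENT) / QUOTA_PERCENT * 1000);
        }

        public static void main(String[] args) {
            // 50ms of handler time in a 1s window = 5%, within a 10% quota.
            System.out.println(recordAndGetThrottleMs("clientA", 50_000_000L));  // 0
            // Another 100ms brings the client to 15%, so delay the response.
            System.out.println(recordAndGetThrottleMs("clientA", 100_000_000L)); // 500
        }
    }
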
> > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > wrote:
> >
> > > Jun,
> > >
> > > Thank you for the review.
> > >
> > > I have reverted to the original KIP that throttles based on request
> > > handler utilization. At the moment, it uses percentage, but I am
> > > happy to change to a fraction (out of 1 instead of 100) if required.
> > > I have added the examples from this discussion to the KIP. Also added
> > > a "Future Work" section to address network thread utilization. The
> > > configuration is named "request_time_percent" with the expectation
> > > that it can also be used as the limit for network thread utilization
> > > when that is implemented, so that users have to set only one config
> > > for the two and not have to worry about the internal distribution of
> > > the work between the two thread pools in Kafka.
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > Thanks for the proposal.
> > > >
> > > > The benefit of using the request processing time over the request
> > > > rate is exactly what people have said. I will just expand that a
> > > > bit. Consider the following case. The producer sends a produce
> > > > request with a 10MB message but compressed to 100KB with gzip. The
> > > > decompression of the message on the broker could take 10-15
> > > > seconds, during which time a request handler thread is completely
> > > > blocked. In this case, neither the byte-in quota nor the request
> > > > rate quota may be effective in protecting the broker. Consider
> > > > another case. A consumer group starts with 10 instances and later
> > > > on switches to 20 instances. The request rate will likely double,
> > > > but the actual load on the broker may not double since each fetch
> > > > request only contains half of the partitions. Request rate quota
> > > > may not be easy to configure in this case.
> > > >
> > > > What we really want is to be able to prevent a client from using
> > > > too much of the server side resources. In this particular KIP, this
> > > > resource is the capacity of the request handler threads. I agree
> > > > that it may not be intuitive for the users to determine how to set
> > > > the right limit. However, this is not completely new and has been
> > > > done in the container world already. For example, Linux cgroup (
> > > > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> > > > has the concept of cpu.cfs_quota_us, which specifies the total
> > > > amount of time in microseconds for which all tasks in a cgroup can
> > > > run during a one second period. We can potentially model the
> > > > request handler threads in a similar way. For example, each request
> > > > handler thread can be 1 request handler unit and the admin can
> > > > configure a limit on how many units (say 0.01) a client can have.
> > > >
> > > > Regarding not throttling the internal broker to broker requests: we
> > > > could do that. Alternatively, we could just let the admin configure
> > > > a high limit for the kafka user (that may not be easy to do based
> > > > on clientId though).
> > > >
> > > > Ideally we want to be able to protect the utilization of the
> > > > network thread pool too. The difficulty is mostly what Rajini said:
> > > > (1) The mechanism for throttling the requests is through Purgatory
> > > > and we will have to think through how to integrate that into the
> > > > network layer. (2) In the network layer, currently we know the
> > > > user, but not the clientId of the request. So, it's a bit tricky to
> > > > throttle based on clientId there. Plus, the byteOut quota can
> > > > already protect the network thread utilization for fetch requests.
> > > > So, if we can't figure out this part right now, just focusing on
> > > > the request handling threads for this KIP is still a useful
> > > > feature.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
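To make the "request handler unit" idea concrete, here is a minimal Java
sketch under the stated analogy to cgroup cpu.cfs_quota_us, where 1.0 unit
equals one fully-occupied handler thread per accounting period. The class
and field names are hypothetical, not the KIP's code:

    // Hypothetical sketch of an absolute "request handler unit" quota.
    // The allowed time is independent of the thread pool size, so
    // doubling the pool does not silently double a client's share.
    public class HandlerUnitQuotaSketch {
        static final double QUOTA_UNITS = 0.01;          // 1% of one thread
        static final long PERIOD_NANOS = 1_000_000_000L; // 1s accounting period
        static long usedNanosInPeriod = 0;

        // Record handler time for a request; true means throttle the client.
        static boolean recordAndCheck(long handlerNanos) {
            usedNanosInPeriod += handlerNanos;
            long allowedNanos = (long) (QUOTA_UNITS * PERIOD_NANOS); // 10ms budget
            return usedNanosInPeriod > allowedNanos;
        }

        public static void main(String[] args) {
            System.out.println(recordAndCheck(5_000_000L)); // 5ms of 10ms: false
            System.out.println(recordAndCheck(8_000_000L)); // 13ms total: true
        }
    }
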
> > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram
> > > > <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > Thank you all for the feedback.
> > > > >
> > > > > Jay: I have removed exemption for consumer heartbeat etc. Agree
> > > > > that protecting the cluster is more important than protecting
> > > > > individual apps. Have retained the exemption for
> > > > > StopReplica/LeaderAndIsr etc; these are throttled only if
> > > > > authorization fails (so they can't be used for DoS attacks in a
> > > > > secure cluster, but inter-broker requests complete without
> > > > > delays).
> > > > >
> > > > > I will wait another day to see if there is any objection to
> > > > > quotas based on request processing time (as opposed to request
> > > > > rate) and if there are no objections, I will revert to the
> > > > > original proposal with some changes.
> > > > >
> > > > > The original proposal was only including the time used by the
> > > > > request handler threads (that made calculation easy). I think the
> > > > > suggestion is to include the time spent in the network threads as
> > > > > well since that may be significant. As Jay pointed out, it is
> > > > > more complicated to calculate the total available CPU time and
> > > > > convert to a ratio when there are *m* I/O threads and *n* network
> > > > > threads. ThreadMXBean#getThreadCPUTime() may give us what we
> > > > > want, but it can be very expensive on some platforms. As Becket
> > > > > and Guozhang have pointed out, we do have several time
> > > > > measurements already for generating metrics that we could use,
> > > > > though we might want to switch to nanoTime() instead of
> > > > > currentTimeMillis() since some of the values for small requests
> > > > > may be < 1ms. But rather than add up the time spent in the I/O
> > > > > thread and the network thread, wouldn't it be better to convert
> > > > > the time spent on each thread into a separate ratio? UserA has a
> > > > > request quota of 5%. Can we take that to mean that UserA can use
> > > > > 5% of the time on network threads and 5% of the time on I/O
> > > > > threads? If either is exceeded, the response is throttled - it
> > > > > would mean maintaining two sets of metrics for the two durations,
> > > > > but would result in more meaningful ratios. We could define two
> > > > > quota limits (UserA has 5% of request threads and 10% of network
> > > > > threads), but that seems unnecessary and harder to explain to
> > > > > users.
> > > > >
> > > > > Back to why and how quotas are applied to network thread
> > > > > utilization:
> > > > > a) In the case of fetch, the time spent in the network thread may
> > > > > be significant and I can see the need to include this. Are there
> > > > > other requests where the network thread utilization is
> > > > > significant? In the case of fetch, request handler thread
> > > > > utilization would throttle clients with high request rate and low
> > > > > data volume, and the fetch byte rate quota will throttle clients
> > > > > with high data volume. Network thread utilization is perhaps
> > > > > proportional to the data volume. I am wondering if we even need
> > > > > to throttle based on network thread utilization or whether the
> > > > > data volume quota covers this case.
> > > > >
> > > > > b) At the moment, we record and check for quota violation at the
> > > > > same time. If a quota is violated, the response is delayed. Using
> > > > > Jay's example of disk reads for fetches happening in the network
> > > > > thread, we can't record and delay a response after the disk
> > > > > reads. We could record the time spent on the network thread when
> > > > > the response is complete and introduce a delay for handling a
> > > > > subsequent request (separating out recording and quota violation
> > > > > handling in the case of network thread overload). Does that make
> > > > > sense?
> > > > >
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > > >
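A small Java sketch of the separate-ratio scheme described above: one
configured percentage, checked independently against the I/O pool and the
network pool, with network time necessarily recorded only after the
response completes. Names are hypothetical, not actual Kafka code:

    // Hypothetical sketch: per-pool utilization ratios for one client,
    // throttled when either pool's ratio exceeds the single quota.
    public class TwoPoolQuotaSketch {
        static final double QUOTA_PERCENT = 5.0;
        static final long WINDOW_NANOS = 1_000_000_000L; // 1s window
        static long ioNanos = 0, networkNanos = 0;

        static void recordIo(long nanos)      { ioNanos += nanos; }
        // Network time is only known once the response completes, so it is
        // recorded late and influences the *next* request from this client.
        static void recordNetwork(long nanos) { networkNanos += nanos; }

        static boolean overQuota() {
            double ioPct  = 100.0 * ioNanos / WINDOW_NANOS;
            double netPct = 100.0 * networkNanos / WINDOW_NANOS;
            return ioPct > QUOTA_PERCENT || netPct > QUOTA_PERCENT;
        }

        public static void main(String[] args) {
            recordIo(30_000_000L);           // 3% of the I/O window
            recordNetwork(70_000_000L);      // 7% of the network window
            System.out.println(overQuota()); // true: network ratio exceeds 5%
        }
    }
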
> > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin
> > > > > <becket.qin@gmail.com> wrote:
> > > > >
> > > > > > Hey Jay,
> > > > > >
> > > > > > Yeah, I agree that enforcing the CPU time is a little tricky. I
> > > > > > am thinking that maybe we can use the existing request
> > > > > > statistics. They are already very detailed so we can probably
> > > > > > see the approximate CPU time from them, e.g. something like
> > > > > > (total_time - request/response_queue_time - remote_time).
> > > > > >
> > > > > > I agree with Guozhang that when a user is throttled it is
> > > > > > likely that we need to see if anything has gone wrong first,
> > > > > > and if the users are well behaved and just need more resources,
> > > > > > we will have to bump up the quota for them. It is true that
> > > > > > pre-allocating CPU time quota precisely for the users is
> > > > > > difficult. So in practice it would probably be more like first
> > > > > > setting a relatively high protective CPU time quota for
> > > > > > everyone and increasing that for some individual clients on
> > > > > > demand.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
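A tiny Java sketch of the arithmetic Becket suggests, deriving an
approximate CPU time from the timing breakdown that Kafka's request
metrics already record. The parameter names are assumptions for
illustration:

    // Hypothetical sketch: approximate a request's CPU time from existing
    // per-request timing metrics instead of calling ThreadMXBean.
    public class ApproxCpuTimeSketch {
        static long approxCpuMs(long totalMs, long requestQueueMs,
                                long responseQueueMs, long remoteMs) {
            // Time not spent waiting in queues or on remote work (e.g.
            // parked in purgatory) roughly equals time occupying threads.
            return totalMs - requestQueueMs - responseQueueMs - remoteMs;
        }

        public static void main(String[] args) {
            // A fetch taking 120ms end to end, 100ms of it in purgatory:
            System.out.println(approxCpuMs(120, 5, 3, 100)); // ~12ms of CPU
        }
    }
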
> > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang
> > > > > > <wangguoz@gmail.com> wrote:
> > > > > >
> > > > > > > This is a great proposal, glad to see it happening.
> > > > > > >
> > > > > > > I am inclined to the CPU throttling as well, or more
> > > > > > > specifically the processing time ratio, instead of the
> > > > > > > request rate throttling. Becket has summed up my rationales
> > > > > > > very well above, and one thing to add here is that the former
> > > > > > > supports both "protecting against rogue clients" and
> > > > > > > "utilizing a cluster for multi-tenancy usage": when thinking
> > > > > > > about how to explain this to the end users, I find it
> > > > > > > actually more natural than the request rate since, as
> > > > > > > mentioned above, different requests will have quite different
> > > > > > > "cost", and Kafka today already has various request types
> > > > > > > (produce, fetch, admin, metadata, etc); because of that, the
> > > > > > > request rate throttling may not be as effective unless it is
> > > > > > > set very conservatively.
> > > > > > >
> > > > > > > Regarding user reactions when they are throttled, I think it
> > > > > > > may differ case-by-case, and needs to be discovered / guided
> > > > > > > by looking at relative metrics. So in other words users would
> > > > > > > not expect to get additional information by simply being told
> > > > > > > "hey, you are throttled", which is all that throttling does;
> > > > > > > they need to take a follow-up step and see "hmm, I'm
> > > > > > > throttled probably because of ..", which they do by looking
> > > > > > > at other metric values: e.g. whether I'm bombarding the
> > > > > > > brokers with ...
>
> [Message clipped]
>
> --
> *Todd Palino*
> Staff Site Reliability Engineer
> Data Infrastructure Streaming
>
> linkedin.com/in/toddpalino
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Todd Palino <tp...@gmail.com>.
I’ve been following this one on and off, and overall it sounds good to me.

- The SSL question is a good one. However, that type of overhead should be
proportional to the byte rate, so I think that a byte-rate quota would
still be a suitable way to address it.

- I think it’s better to make the quota a percentage of total thread pool
capacity, and not a percentage of an individual thread. That way you don’t
have to adjust it when you adjust thread counts (tuning, hardware changes,
etc.).


-Todd



On Tue, Mar 7, 2017 at 2:38 PM, Becket Qin <be...@gmail.com> wrote:

> I see. Good point about SSL.
>
> I just asked Todd to take a look.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Jiangjie,
> >
> > Yes, I agree that byte rate already protects the network threads
> > indirectly. I am not sure if byte rate fully captures the CPU overhead in
> > network due to SSL. So, at the high level, we can use request time limit
> > to protect CPU and use byte rate to protect storage and network.
> >
> > Also, do you think you can get Todd to comment on this KIP?
> >
> > Thanks,
> >
> > Jun
> >
> > On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <be...@gmail.com> wrote:
> >
> > > Hi Rajini/Jun,
> > >
> > > The percentage based reasoning sounds good.
> > > One thing I am wondering is that if we assume the network threads are
> > > just doing the network IO, can we say the byte rate quota is already
> > > sort of a network thread quota?
> > > If we take network threads into consideration here, would that be
> > > somewhat overlapping with the byte rate quota?
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > wrote:
> > >
> > > > Jun,
> > > >
> > > > Thank you for the explanation, I hadn't realized you meant percentage
> > > > of the total thread pool. If everyone is OK with Jun's suggestion, I
> > > > will update the KIP.
> > > >
> > > > Thanks,
> > > >
> > > > Rajini
> > > >
> > > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Rajini,
> > > > >
> > > > > Let's take your example. Let's say a user sets the limit to 50%. I
> > > > > am not sure if it's better to apply the same percentage separately
> > > > > to the network and io thread pools. For example, for produce
> > > > > requests, most of the time will be spent in the io threads whereas
> > > > > for fetch requests, most of the time will be in the network
> > > > > threads. So, using the same percentage in both thread pools means
> > > > > one of the pools' resources will be over-allocated.
> > > > >
> > > > > An alternative way is to simply model network and io thread pool
> > > > together.
> > > > > If you get 10 io threads and 5 network threads, you get 1500%
> request
> > > > > processing power. A 50% limit means a total of 750% processing
> power.
> > > We
> > > > > just add up the time a user request spent in either network or io
> > > thread.
> > > > > If that total exceeds 750% (doesn't matter whether it's spent more
> in
> > > > > network or io thread), the request will be throttled. This seems
> more
> > > > > general and is not sensitive to the current implementation detail
> of
> > > > having
> > > > > a separate network and io thread pool. In the future, if the
> > threading
> > > > > model changes, the same concept of quota can still be applied. For
> > now,
> > > > > since it's a bit tricky to add the delay logic in the network
> thread
> > > > pool,
> > > > > we could probably just do the delaying only in the io threads as
> you
> > > > > suggested earlier.
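
To make the arithmetic above concrete, here is a rough sketch (hypothetical
names, Scala) of the combined-pool accounting Jun describes: one sensor per
user sums the time spent in either thread pool and reports it in the
top-style "percent of one thread" unit.

    // Sketch only: assumes the (n + m) * 100% model, where n io threads plus
    // m network threads give (n + m) "cores" of request processing power.
    class ThreadUsageSensor(windowMs: Long) {
      private var usedNanos = 0L  // thread time this user consumed in the window

      // Called with the time a request spent in *either* the network or io pool.
      def record(nanos: Long): Unit = synchronized { usedNanos += nanos }

      // Utilization as a percentage of a single thread, like top: with 10 io
      // and 5 network threads the aggregate across all users can reach 1500.0.
      def utilizationPercent: Double = synchronized {
        usedNanos.toDouble / (windowMs * 1000000L) * 100.0
      }
    }

    // A user given half the capacity of a 10 + 5 broker has quota 750.0 and
    // is throttled once its combined usage exceeds that share of the window.
    def shouldThrottle(sensor: ThreadUsageSensor, quotaPercent: Double): Boolean =
      sensor.utilizationPercent > quotaPercent
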
> > > > >
> > > > > There is still the orthogonal question of whether a quota of 50% is
> > out
> > > > of
> > > > > 100% or 100% * #total processing threads. My feeling is that the
> > latter
> > > > is
> > > > > slightly better based on my explanation earlier. The way to
> describe
> > > this
> > > > > quota to the users can be "share of elapsed request processing time
> > on
> > > a
> > > > > single CPU" (similar to top).
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <
> > > rajinisivaram@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Jun,
> > > > > >
> > > > > > Agree about the two scenarios.
> > > > > >
> > > > > > But still not sure about a single quota covering both network
> > threads
> > > > and
> > > > > > I/O threads with per-thread quota. If there are 10 I/O threads
> and
> > 5
> > > > > > network threads and I want to assign half the quota to userA, the
> > > quota
> > > > > > would be 750%. I imagine, internally, we would convert this to
> 500%
> > > for
> > > > > I/O
> > > > > > and 250% for network threads to allocate 50% of each pool.
> > > > > >
> > > > > > A couple of scenarios:
> > > > > >
> > > > > > 1. Admin adds 1 extra network thread. To retain 50%, admin needs
> to
> > > now
> > > > > > allocate 800% for each user. Or increase the quota for a few
> users.
> > > To
> > > > > me,
> > > > > > it feels like admin needs to convert 50% to 800% and Kafka
> > internally
> > > > > needs
> > > > > > to convert 800% to (500%, 300%). Everyone using just 50% feels a
> > lot
> > > > > > simpler.
> > > > > >
> > > > > > 2. We decide to add some other thread to this list. Admin needs
> to
> > > know
> > > > > > exactly how many threads form the maximum quota. And we can be
> > > changing
> > > > > > this between broker versions as we add more to the list. Again a
> > > single
> > > > > > overall percent would be a lot simpler.
> > > > > >
> > > > > > There were others who were unconvinced by a single percent from
> the
> > > > > initial
> > > > > > proposal and were happier with thread units similar to CPU units,
> > so
> > > I
> > > > am
> > > > > > ok with going with per-thread quotas (as units or percent). Just
> > not
> > > > sure
> > > > > > it makes it easier for admin in all cases.
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > >
> > > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io>
> wrote:
> > > > > >
> > > > > > > Hi, Rajini,
> > > > > > >
> > > > > > > Consider modeling as n * 100% unit. For 2), the question is
> > what's
> > > > > > causing
> > > > > > > the I/O threads to be saturated. It's unlikely that all users'
> > > > > > utilization
> > > > > > > has increased at the same time. A more likely case is that a few
> > > isolated
> > > > > > > users' utilization has increased. If so, after increasing the
> > > number
> > > > > of
> > > > > > > threads, the admin just needs to adjust the quota for a few
> > > isolated
> > > > > > users,
> > > > > > > which is expected and is less work.
> > > > > > >
> > > > > > > Consider modeling as 1 * 100% unit. For 1), all users' quotas
> > > > > > > need to be adjusted, which is unexpected and is more work.
> > > > > > >
> > > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > > >
> > > > > > > As for future extension to cover network thread utilization, I
> > was
> > > > > > thinking
> > > > > > > that one way is to simply model the capacity as (n + m) * 100%
> > > unit,
> > > > > > where
> > > > > > > n and m are the number of network and i/o threads,
> respectively.
> > > > Then,
> > > > > > for
> > > > > > > each user, we can just add up the utilization in the network
> and
> > > the
> > > > > i/o
> > > > > > > thread. If we do this, we don't need a new type of quota.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <
> > > > > rajinisivaram@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Jun,
> > > > > > > >
> > > > > > > > If we use request.percentage as the percentage used in a
> single
> > > I/O
> > > > > > > thread,
> > > > > > > > the total percentage being allocated will be num.io.threads *
> > 100
> > > > for
> > > > > > I/O
> > > > > > > > threads and num.network.threads * 100 for network threads. A
> > > single
> > > > > > quota
> > > > > > > > covering the two as a percentage wouldn't quite work if you
> > want
> > > to
> > > > > > > > allocate the same proportion in both cases. If we want to
> treat
> > > > > threads
> > > > > > > as
> > > > > > > > separate units, won't we need two quota configurations
> > regardless
> > > > of
> > > > > > > > whether we use units or percentage? Perhaps I misunderstood
> > your
> > > > > > > > suggestion.
> > > > > > > >
> > > > > > > > I think there are two cases:
> > > > > > > >
> > > > > > > >    1. The use case that you mentioned where an admin is
> adding
> > > more
> > > > > > users
> > > > > > > >    and decides to add more I/O threads and expects to find
> free
> > > > quota
> > > > > > to
> > > > > > > >    allocate for new users.
> > > > > > > >    2. Admin adds more I/O threads because the I/O threads are
> > > > > saturated
> > > > > > > and
> > > > > > > >    there are cores available to allocate, even though the
> > > > > > > >    number of users/clients hasn't changed.
> > > > > > > >
> > > > > > > > If we treated I/O threads as a single unit of 100%, all user
> > > > > > > > quotas need to be reallocated for 1). If we allocated I/O
> > threads
> > > > as
> > > > > n
> > > > > > > > units with n*100%, all user quotas need to be reallocated for
> > 2),
> > > > > > > otherwise
> > > > > > > > some of the new threads may just not be used. Either way it
> > > should
> > > > be
> > > > > > > easy
> > > > > > > > to write a script to decrease/increase quotas by a multiple
> for
> > > all
> > > > > > > users.
> > > > > > > >
> > > > > > > > So it really boils down to which quota unit is most intuitive
> > in
> > > > > terms
> > > > > > of
> > > > > > > > configuration. And from the discussion so far, it feels like
> > > > opinion
> > > > > is
> > > > > > > > divided on whether quotas should be carved out of an absolute
> > > 100%
> > > > > (or
> > > > > > 1
> > > > > > > > unit) or be relative to the number of threads (n*100% or n
> > > units).
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io>
> > > wrote:
> > > > > > > >
> > > > > > > > > Another way to express an absolute limit is to use
> > > > > > request.percentage,
> > > > > > > > but
> > > > > > > > > treat it as the percentage used in a single request
> handling
> > > > > thread.
> > > > > > > For
> > > > > > > > > now, the request handling threads can be just the io
> threads.
> > > In
> > > > > the
> > > > > > > > > future, they can cover the network threads as well. This is
> > > > similar
> > > > > > to
> > > > > > > > how
> > > > > > > > > top reports CPU usage and may be a bit easier for people to
> > > > > > understand.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <
> jun@confluent.io>
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Jay,
> > > > > > > > > >
> > > > > > > > > > 2. Regarding request.unit vs request.percentage. I
> started
> > > with
> > > > > > > > > > request.percentage too. The reasoning for request.unit is
> > the
> > > > > > > > following.
> > > > > > > > > > Suppose that the capacity has been reached on a broker
> and
> > > the
> > > > > > admin
> > > > > > > > > needs
> > > > > > > > > > to add a new user. A simple way to increase the capacity
> is
> > > to
> > > > > > > increase
> > > > > > > > > the
> > > > > > > > > > number of io threads, assuming there are still enough
> > cores.
> > > If
> > > > > the
> > > > > > > > limit
> > > > > > > > > > is based on percentage, the additional capacity
> > automatically
> > > > > gets
> > > > > > > > > > distributed to existing users and we haven't really
> carved
> > > out
> > > > > any
> > > > > > > > > > additional resource for the new user. Now, is it easy
> for a
> > > > user
> > > > > to
> > > > > > > > > reason
> > > > > > > > > > about 0.1 unit vs 10%? My feeling is that both are hard
> and
> > > > have
> > > > > to
> > > > > > > be
> > > > > > > > > > configured empirically. Not sure if percentage is
> obviously
> > > > > easier
> > > > > > to
> > > > > > > > > > reason about.
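
A quick worked comparison of the two models, reading one "unit" as one
request handler thread:

    unit model:    quota = 0.1 unit -> 0.1 threads, regardless of pool size
    percent model: quota = 10%      -> 1 thread when num.io.threads=10,
                                       2 threads when num.io.threads=20

So doubling the pool silently doubles every percentage-based quota, which is
the capacity-planning concern Jun raises: no headroom is carved out for a
new user.
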
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <
> > jay@confluent.io
> > > >
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > >> A couple of quick points:
> > > > > > > > > >>
> > > > > > > > > >> 1. Even though the implementation of this quota is only
> > > using
> > > > io
> > > > > > > > thread
> > > > > > > > > >> time, I think we should call it something like
> > > "request-time".
> > > > > > This
> > > > > > > > will
> > > > > > > > > >> give us flexibility to improve the implementation to
> cover
> > > > > network
> > > > > > > > > threads
> > > > > > > > > >> in the future and will avoid exposing internal details
> > like
> > > > our
> > > > > > > thread
> > > > > > > > > >> pools on the server.
> > > > > > > > > >>
> > > > > > > > > >> 2. Jun/Roger, I get what you are trying to fix but the
> > idea
> > > of
> > > > > > > > > >> thread/units
> > > > > > > > > >> is super unintuitive as a user-facing knob. I had to
> read
> > > the
> > > > > KIP
> > > > > > > like
> > > > > > > > > >> eight times to understand this. I'm not sure about your
> > > > > > > > > >> point that increasing the number of threads is a problem
> > > > > > > > > >> with a percentage-based value; it really depends on whether
> > > > > > > > > >> the user thinks about the "percentage
> > > > > > > > > >> of request processing time" or "thread units". If they
> > think
> > > > "I
> > > > > > have
> > > > > > > > > >> allocated 10% of my request processing time to user x"
> > then
> > > it
> > > > > is
> > > > > > a
> > > > > > > > bug
> > > > > > > > > >> that increasing the thread count decreases that percent
> as
> > > it
> > > > > does
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > >> current proposal. As a practical matter I think the only
> > way
> > > > to
> > > > > > > > actually
> > > > > > > > > >> reason about this is as a percent---I just don't believe
> > > > people
> > > > > > are
> > > > > > > > > going
> > > > > > > > > >> to think, "ah, 4.3 thread units, that is the right
> > amount!".
> > > > > > > Instead I
> > > > > > > > > >> think they have to understand this thread unit concept,
> > > figure
> > > > > out
> > > > > > > > what
> > > > > > > > > >> they have set in number of threads, compute a percent
> and
> > > then
> > > > > > come
> > > > > > > up
> > > > > > > > > >> with
> > > > > > > > > >> the number of thread units, and these will all be wrong
> if
> > > > that
> > > > > > > thread
> > > > > > > > > >> count changes. I also think this ties us to throttling
> the
> > > I/O
> > > > > > > thread
> > > > > > > > > >> pool,
> > > > > > > > > >> which may not be where we want to end up.
> > > > > > > > > >>
> > > > > > > > > >> 3. For what it's worth I do think having a single
> > > throttle_ms
> > > > > > field
> > > > > > > in
> > > > > > > > > all
> > > > > > > > > >> the responses that combines all throttling from all
> quotas
> > > is
> > > > > > > probably
> > > > > > > > > the
> > > > > > > > > >> simplest. There could be a use case for having separate
> > > fields
> > > > > for
> > > > > > > > each,
> > > > > > > > > >> but I think that is actually harder to use/monitor in
> the
> > > > common
> > > > > > > case
> > > > > > > > so
> > > > > > > > > >> unless someone has a use case I think just one should be
> > > fine.
> > > > > > > > > >>
> > > > > > > > > >> -Jay
> > > > > > > > > >>
> > > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > > > > > > rajinisivaram@gmail.com>
> > > > > > > > > >> wrote:
> > > > > > > > > >>
> > > > > > > > > >> > I have updated the KIP based on the discussions so
> far.
> > > > > > > > > >> >
> > > > > > > > > >> >
> > > > > > > > > >> > Regards,
> > > > > > > > > >> >
> > > > > > > > > >> > Rajini
> > > > > > > > > >> >
> > > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > > > > > > >> rajinisivaram@gmail.com>
> > > > > > > > > >> > wrote:
> > > > > > > > > >> >
> > > > > > > > > >> > > Thank you all for the feedback.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Ismael #1. It makes sense not to throttle
> inter-broker
> > > > > > requests
> > > > > > > > like
> > > > > > > > > >> > > LeaderAndIsr etc. The simplest way to ensure that
> > > clients
> > > > > > cannot
> > > > > > > > use
> > > > > > > > > >> > these
> > > > > > > > > >> > > requests to bypass quotas for DoS attacks is to
> ensure
> > > > that
> > > > > > ACLs
> > > > > > > > > >> prevent
> > > > > > > > > >> > > clients from using these requests and unauthorized
> > > > requests
> > > > > > are
> > > > > > > > > >> included
> > > > > > > > > >> > > towards quotas.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Ismael #2, Jay #1 : I was thinking that these quotas
> > can
> > > > > > return
> > > > > > > a
> > > > > > > > > >> > separate
> > > > > > > > > >> > > throttle time, and all utilization based quotas
> could
> > > use
> > > > > the
> > > > > > > same
> > > > > > > > > >> field
> > > > > > > > > >> > > (we won't add another one for network thread
> > utilization
> > > > for
> > > > > > > > > >> instance).
> > > > > > > > > >> > But
> > > > > > > > > >> > > perhaps it makes sense to keep byte rate quotas
> > separate
> > > > in
> > > > > > > > > >> produce/fetch
> > > > > > > > > >> > > responses to provide separate metrics? Agree with
> > Ismael
> > > > > that
> > > > > > > the
> > > > > > > > > >> name of
> > > > > > > > > >> > > the existing field should be changed if we have two.
> > > Happy
> > > > > to
> > > > > > > > switch
> > > > > > > > > >> to a
> > > > > > > > > >> > > single combined throttle time if that is sufficient.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will use dot
> > > separated
> > > > > > name
> > > > > > > > for
> > > > > > > > > >> new
> > > > > > > > > >> > > property. Replication quotas use dot separated, so
> it
> > > will
> > > > > be
> > > > > > > > > >> consistent
> > > > > > > > > >> > > with all properties except byte rate quotas.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Radai: #1 Request processing time rather than
> request
> > > rate
> > > > > > was
> > > > > > > > > chosen
> > > > > > > > > >> > > because the time per request can vary significantly
> > > > between
> > > > > > > > requests
> > > > > > > > > >> as
> > > > > > > > > >> > > mentioned in the discussion and KIP.
> > > > > > > > > >> > > #2 Two separate quotas for heartbeats/regular
> requests
> > > > feel
> > > > > > like
> > > > > > > > > more
> > > > > > > > > >> > > configuration and more metrics. Since most users
> would
> > > set
> > > > > > > quotas
> > > > > > > > > >> higher
> > > > > > > > > >> > > than the expected usage and quotas are more of a
> > safety
> > > > > net, a
> > > > > > > > > single
> > > > > > > > > >> > quota
> > > > > > > > > >> > > should work in most cases.
> > > > > > > > > >> > >  #3 The number of requests in purgatory is limited
> by
> > > the
> > > > > > number
> > > > > > > > of
> > > > > > > > > >> > active
> > > > > > > > > >> > > connections since only one request per connection
> will
> > > be
> > > > > > > > throttled
> > > > > > > > > >> at a
> > > > > > > > > >> > > time.
> > > > > > > > > >> > > #4 As with byte rate quotas, to use the full
> allocated
> > > > > quotas,
> > > > > > > > > >> > > clients/users would need to use partitions that are
> > > > > > distributed
> > > > > > > > > across
> > > > > > > > > >> > the
> > > > > > > > > >> > > cluster. The alternative of using cluster-wide
> quotas
> > > > > instead
> > > > > > of
> > > > > > > > > >> > per-broker
> > > > > > > > > >> > > quotas would be far too complex to implement.
> > > > > > > > > >> > >
> > > > > > > > > >> > > Dong : We currently have two ClientQuotaManagers for
> > > quota
> > > > > > types
> > > > > > > > > Fetch
> > > > > > > > > >> > and
> > > > > > > > > >> > > Produce. A new one will be added for IOThread, which
> > > > manages
> > > > > > > > quotas
> > > > > > > > > >> for
> > > > > > > > > >> > I/O
> > > > > > > > > >> > > thread utilization. This will not update the Fetch
> or
> > > > > Produce
> > > > > > > > > >> queue-size,
> > > > > > > > > >> > > but will have a separate metric for the
> queue-size.  I
> > > > > wasn't
> > > > > > > > > >> planning to
> > > > > > > > > >> > > add any additional metrics apart from the equivalent
> > > ones
> > > > > for
> > > > > > > > > existing
> > > > > > > > > >> > > quotas as part of this KIP. Ratio of byte-rate to
> I/O
> > > > thread
> > > > > > > > > >> utilization
> > > > > > > > > >> > > could be slightly misleading since it depends on the
> > > > > sequence
> > > > > > of
> > > > > > > > > >> > requests.
> > > > > > > > > >> > > But we can look into more metrics after the KIP is
> > > > > implemented
> > > > > > > if
> > > > > > > > > >> > required.
> > > > > > > > > >> > >
> > > > > > > > > >> > > I think we need to limit the maximum delay since all
> > > > > requests
> > > > > > > are
> > > > > > > > > >> > > throttled. If a client has a quota of 0.001 units
> and
> > a
> > > > > single
> > > > > > > > > request
> > > > > > > > > >> > used
> > > > > > > > > >> > > 50ms, we don't want to delay all requests from the
> > > client
> > > > by
> > > > > > 50
> > > > > > > > > >> seconds,
> > > > > > > > > >> > > throwing the client out of all its consumer groups.
> > The
> > > > > issue
> > > > > > is
> > > > > > > > > only
> > > > > > > > > >> if
> > > > > > > > > >> > a
> > > > > > > > > >> > > user is allocated a quota that is insufficient to
> > > process
> > > > > one
> > > > > > > > large
> > > > > > > > > >> > > request. The expectation is that the units allocated
> > per
> > > > > user
> > > > > > > will
> > > > > > > > > be
> > > > > > > > > >> > much
> > > > > > > > > >> > > higher than the time taken to process one request
> and
> > > the
> > > > > > limit
> > > > > > > > > should
> > > > > > > > > >> > > seldom be applied. Agree this needs proper
> > > documentation.
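
A rough sketch of the throttle-time arithmetic behind this example
(hypothetical names; the exact formula is whatever the KIP/PR settles on):

    // Sketch only. With a quota q expressed as a fraction of one thread, a
    // request that used t ms of thread time needs a delay of t / q - t ms to
    // bring the client back under quota.
    def naiveDelayMs(threadTimeMs: Double, quotaFraction: Double): Double =
      threadTimeMs / quotaFraction - threadTimeMs

    // The pathological case above: naiveDelayMs(50.0, 0.001) = 49950.0 ms,
    // i.e. roughly 50 seconds, enough to drop the client out of its consumer
    // groups. Hence the cap at the quota window size:
    def cappedDelayMs(threadTimeMs: Double, quotaFraction: Double,
                      windowMs: Double): Double =
      math.min(naiveDelayMs(threadTimeMs, quotaFraction), windowMs)
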
> > > > > > > > > >> > >
> > > > > > > > > >> > > Regards,
> > > > > > > > > >> > >
> > > > > > > > > >> > > Rajini
> > > > > > > > > >> > >
> > > > > > > > > >> > >
> > > > > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > > > > > radai.rosenblatt@gmail.com>
> > > > > > > > > >> > wrote:
> > > > > > > > > >> > >
> > > > > > > > > >> > >> @jun: i wasnt concerned about tying up a request
> > > > processing
> > > > > > > > thread,
> > > > > > > > > >> but
> > > > > > > > > >> > >> IIUC the code does still read the entire request
> out,
> > > > which
> > > > > > > might
> > > > > > > > > >> add-up
> > > > > > > > > >> > >> to
> > > > > > > > > >> > >> a non-negligible amount of memory.
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > > > > > > lindong28@gmail.com>
> > > > > > > > > >> wrote:
> > > > > > > > > >> > >>
> > > > > > > > > >> > >> > Hey Rajini,
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > The current KIP says that the maximum delay will
> be
> > > > > reduced
> > > > > > > to
> > > > > > > > > >> window
> > > > > > > > > >> > >> size
> > > > > > > > > >> > >> > if it is larger than the window size. I have a
> > > concern
> > > > > with
> > > > > > > > this:
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > 1) This essentially means that the user is
> allowed
> > to
> > > > > > exceed
> > > > > > > > > their
> > > > > > > > > >> > quota
> > > > > > > > > >> > >> > over a long period of time. Can you provide an
> > upper
> > > > > bound
> > > > > > on
> > > > > > > > > this
> > > > > > > > > >> > >> > deviation?
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > 2) What is the motivation for capping the maximum
> delay
> > > by
> > > > > the
> > > > > > > > window
> > > > > > > > > >> > size?
> > > > > > > > > >> > >> I
> > > > > > > > > >> > >> > am wondering if there is a better alternative to
> > > address
> > > > > the
> > > > > > > > > problem.
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > 3) It means that the existing metric-related
> config
> > > > will
> > > > > > > have a
> > > > > > > > > >> more
> > > > > > > > > >> > >> > direct impact on the mechanism of this
> > > > > > io-thread-unit-based
> > > > > > > > > >> quota.
> > > > > > > > > >> > >> > This may be an important change depending on the
> answer
> > to
> > > > 1)
> > > > > > > above.
> > > > > > > > > We
> > > > > > > > > >> > >> probably
> > > > > > > > > >> > >> > need to document this more explicitly.
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > Dong
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > > > > > > lindong28@gmail.com>
> > > > > > > > > >> > wrote:
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > > Hey Jun,
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > Yeah you are right. I thought it wasn't because
> > at
> > > > > > LinkedIn
> > > > > > > > it
> > > > > > > > > >> will
> > > > > > > > > >> > be
> > > > > > > > > >> > >> > too
> > > > > > > > > >> > >> > > much pressure on inGraph to expose those
> > > per-clientId
> > > > > > > metrics
> > > > > > > > > so
> > > > > > > > > >> we
> > > > > > > > > >> > >> ended
> > > > > > > > > >> > >> > > up printing them periodically to a local log. Never
> Never
> > > > mind
> > > > > if
> > > > > > > it
> > > > > > > > is
> > > > > > > > > >> not
> > > > > > > > > >> > a
> > > > > > > > > >> > >> > > general problem.
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > Hey Rajini,
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > - I agree with Jay that we probably don't want to add
> > > > > > > > > >> > >> > > a new field to ProduceResponse or FetchResponse for
> > > > > > > > > >> > >> > > every quota. Is there any use-case for having separate
> > > > > > > > > >> > >> > > throttle-time fields for byte-rate-quota and
> > > > > > > > > >> > >> > > io-thread-unit-quota? You probably need to document
> > > > > > > > > >> > >> > > this as an interface change if you plan to add a new
> > > > > > > > > >> > >> > > field to any request.
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > - I don't think IOThread belongs in quotaType. The
> > > > > > > > > >> > >> > > existing quota types (i.e.
> > > > > > > > > >> > >> > > Produce/Fetch/LeaderReplication/FollowerReplication)
> > > > > > > > > >> > >> > > identify the types of requests that are throttled,
> > > > > > > > > >> > >> > > not the quota mechanism that is applied.
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > - If a request is throttled due to this
> > > > > > > io-thread-unit-based
> > > > > > > > > >> quota,
> > > > > > > > > >> > is
> > > > > > > > > >> > >> > the
> > > > > > > > > >> > >> > > existing queue-size metric in
> ClientQuotaManager
> > > > > > > incremented?
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > - In the interest of providing a guideline for the
> > > > > > > > > >> > >> > > admin to decide the io-thread-unit-based quota and
> > > > > > > > > >> > >> > > for users to understand its impact on their traffic,
> > > > > > > > > >> > >> > > would it be useful to have a metric that shows the
> > > > > > > > > >> > >> > > overall byte-rate per io-thread-unit? Can we also
> > > > > > > > > >> > >> > > show this as a per-clientId metric?
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > Thanks,
> > > > > > > > > >> > >> > > Dong
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <
> > > > > > jun@confluent.io
> > > > > > > >
> > > > > > > > > >> wrote:
> > > > > > > > > >> > >> > >
> > > > > > > > > >> > >> > >> Hi, Ismael,
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> For #3, typically, an admin won't configure
> more
> > > io
> > > > > > > threads
> > > > > > > > > than
> > > > > > > > > >> > CPU
> > > > > > > > > >> > >> > >> cores,
> > > > > > > > > >> > >> > >> but it's possible for an admin to start with
> > fewer
> > > > io
> > > > > > > > threads
> > > > > > > > > >> than
> > > > > > > > > >> > >> cores
> > > > > > > > > >> > >> > >> and grow that later on.
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> Hi, Dong,
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> I think the throttleTime sensor on the broker
> > > tells
> > > > > the
> > > > > > > > admin
> > > > > > > > > >> > >> whether a
> > > > > > > > > >> > >> > >> user/clientId is throttled or not.
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> Hi, Radai,
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> The reasoning for delaying the throttled
> > requests
> > > on
> > > > > the
> > > > > > > > > broker
> > > > > > > > > >> > >> instead
> > > > > > > > > >> > >> > of
> > > > > > > > > >> > >> > >> returning an error immediately is that the
> > latter
> > > > has
> > > > > no
> > > > > > > way
> > > > > > > > > to
> > > > > > > > > >> > >> prevent
> > > > > > > > > >> > >> > >> the
> > > > > > > > > >> > >> > >> client from retrying immediately, which will
> > make
> > > > > things
> > > > > > > > > worse.
> > > > > > > > > >> The
> > > > > > > > > >> > >> > >> delaying logic is based off a delay queue. A
> > > > separate
> > > > > > > > > expiration
> > > > > > > > > >> > >> thread
> > > > > > > > > >> > >> > >> just waits on the next to be expired request.
> > So,
> > > it
> > > > > > > doesn't
> > > > > > > > > tie
> > > > > > > > > >> > up a
> > > > > > > > > >> > >> > >> request handler thread.
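
A minimal sketch of the delay-queue mechanism Jun describes, using
java.util.concurrent.DelayQueue (hypothetical wrapper type; the broker's real
implementation lives in the quota manager):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // Sketch only: a throttled response sits in the queue until its delay
    // expires; one expiration thread completes it, so no request handler
    // thread is tied up while a client is being throttled.
    class ThrottledResponse(delayMs: Long, send: () => Unit) extends Delayed {
      private val dueNanos = System.nanoTime + TimeUnit.MILLISECONDS.toNanos(delayMs)
      def complete(): Unit = send()
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(dueNanos - System.nanoTime, TimeUnit.NANOSECONDS)
      override def compareTo(other: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS),
                               other.getDelay(TimeUnit.NANOSECONDS))
    }

    val throttledResponses = new DelayQueue[ThrottledResponse]()
    val expirationThread = new Thread(() => {
      while (!Thread.currentThread.isInterrupted)
        throttledResponses.take().complete()  // blocks until the next expiry
    }, "quota-expiration")
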
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> Thanks,
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> Jun
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <
> > > > > > > > > ismael@juma.me.uk
> > > > > > > > > >> >
> > > > > > > > > >> > >> wrote:
> > > > > > > > > >> > >> > >>
> > > > > > > > > >> > >> > >> > Hi Jay,
> > > > > > > > > >> > >> > >> >
> > > > > > > > > >> > >> > >> > Regarding 1, I definitely like the
> simplicity
> > of
> > > > > > > keeping a
> > > > > > > > > >> single
> > > > > > > > > >> > >> > >> throttle
> > > > > > > > > >> > >> > >> > time field in the response. The downside is
> > that
> > > > the
> > > > > > > > client
> > > > > > > > > >> > metrics
> > > > > > > > > >> > >> > >> will be
> > > > > > > > > >> > >> > >> > more coarse grained.
> > > > > > > > > >> > >> > >> >
> > > > > > > > > >> > >> > >> > Regarding 3, we have
> > > > > > > > > >> > >> > >> > `leader.imbalance.per.broker.percentage` and
> > > > > > > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > > > > > > >> > >> > >> >
> > > > > > > > > >> > >> > >> > Ismael
> > > > > > > > > >> > >> > >> >
> > > > > > > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <
> > > > > > > > > jay@confluent.io>
> > > > > > > > > >> > >> wrote:
> > > > > > > > > >> > >> > >> >
> > > > > > > > > >> > >> > >> > > A few minor comments:
> > > > > > > > > >> > >> > >> > >
> > > > > > > > > >> > >> > >> > >    1. Isn't it the case that the
> throttling
> > > time
> > > > > > > > response
> > > > > > > > > >> field
> > > > > > > > > >> > >> > should
> > > > > > > > > >> > >> > >> > have
> > > > > > > > > >> > >> > >> > >    the total time your request was
> throttled
> > > > > > > > irrespective
> > > > > > > > > of
> > > > > > > > > >> > the
> > > > > > > > > >> > >> > >> quotas
> > > > > > > > > >> > >> > >> > > that
> > > > > > > > > >> > >> > >> > >    caused that. Limiting it to byte rate
> > quota
> > > > > > doesn't
> > > > > > > > > make
> > > > > > > > > >> > >> sense,
> > > > > > > > > >> > >> > >> but I
> > > > > > > > > >> > >> > >> > > also
> > > > > > > > > >> > >> > >> > >    I don't think we want to end up adding
> > new
> > > > > fields
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > >> > >> response
> > > > > > > > > >> > >> > >> for
> > > > > > > > > >> > >> > >> > > every
> > > > > > > > > >> > >> > >> > >    single thing we quota, right?
> > > > > > > > > >> > >> > >> > >    2. I don't think we should make this
> > quota
> > > > > > > > specifically
> > > > > > > > > >> > about
> > > > > > > > > >> > >> io
> > > > > > > > > >> > >> > >> > >    threads. Once we introduce these quotas
> > > > people
> > > > > > set
> > > > > > > > them
> > > > > > > > > >> and
> > > > > > > > > >> > >> > expect
> > > > > > > > > >> > >> > >> > them
> > > > > > > > > >> > >> > >> > > to
> > > > > > > > > >> > >> > >> > >    be enforced (and if they aren't it may
> > > cause
> > > > an
> > > > > > > > > outage).
> > > > > > > > > >> As
> > > > > > > > > >> > a
> > > > > > > > > >> > >> > >> result
> > > > > > > > > >> > >> > >> > > they
> > > > > > > > > >> > >> > >> > >    are a bit more sensitive than normal
> > > > configs, I
> > > > > > > > think.
> > > > > > > > > >> The
> > > > > > > > > >> > >> > current
> > > > > > > > > >> > >> > >> > > thread
> > > > > > > > > >> > >> > >> > >    pools seem like something of an
> > > > implementation
> > > > > > > detail
> > > > > > > > > and
> > > > > > > > > >> > not
> > > > > > > > > >> > >> the
> > > > > > > > > >> > >> > >> > level
> > > > > > > > > >> > >> > >> > > the
> > > > > > > > > >> > >> > >> > >    user-facing quotas should be involved
> > > with. I
> > > > > > think
> > > > > > > > it
> > > > > > > > > >> might
> > > > > > > > > >> > >> be
> > > > > > > > > >> > >> > >> better
> > > > > > > > > >> > >> > >> > > to
> > > > > > > > > >> > >> > >> > >    make this a general request-time
> throttle
> > > > with
> > > > > no
> > > > > > > > > >> mention in
> > > > > > > > > >> > >> the
> > > > > > > > > >> > >> > >> > naming
> > > > > > > > > >> > >> > >> > >    about I/O threads and simply
> acknowledge
> > > the
> > > > > > > current
> > > > > > > > > >> > >> limitation
> > > > > > > > > >> > >> > >> (which
> > > > > > > > > >> > >> > >> > > we
> > > > > > > > > >> > >> > >> > >    may someday fix) in the docs that this
> > > covers
> > > > > > only
> > > > > > > > the
> > > > > > > > > >> time
> > > > > > > > > >> > >> after
> > > > > > > > > >> > >> > >> the
> > > > > > > > > >> > >> > >> > >    request is read off the network.
> > > > > > > > > >> > >> > >> > >    3. As such I think the right interface
> to
> > > the
> > > > > > user
> > > > > > > > > would
> > > > > > > > > >> be
> > > > > > > > > >> > >> > >> something
> > > > > > > > > >> > >> > >> > >    like percent_request_time and be in
> > > > {0,...100}
> > > > > or
> > > > > > > > > >> > >> > >> request_time_ratio
> > > > > > > > > >> > >> > >> > > and be
> > > > > > > > > >> > >> > >> > >    in {0.0,...,1.0} (I think "ratio" is
> the
> > > > > > > terminology
> > > > > > > > we
> > > > > > > > > >> used
> > > > > > > > > >> > >> if
> > > > > > > > > >> > >> > the
> > > > > > > > > >> > >> > >> > > scale
> > > > > > > > > >> > >> > >> > >    is between 0 and 1 in the other
> metrics,
> > > > > right?)
> > > > > > > > > >> > >> > >> > >
> > > > > > > > > >> > >> > >> > > -Jay
> > > > > > > > > >> > >> > >> > >
> > > > > > > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini
> > > Sivaram
> > > > <
> > > > > > > > > >> > >> > >> rajinisivaram@gmail.com
> > > > > > > > > >> > >> > >> > >
> > > > > > > > > >> > >> > >> > > wrote:
> > > > > > > > > >> > >> > >> > >
> > > > > > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > Thank you for the feedback.
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > Guozhang : I have updated the section on
> > > > > > > co-existence
> > > > > > > > of
> > > > > > > > > >> byte
> > > > > > > > > >> > >> rate
> > > > > > > > > >> > >> > >> and
> > > > > > > > > >> > >> > >> > > > request time quotas.
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > Dong: I hadn't added much detail to the
> > > > metrics
> > > > > > and
> > > > > > > > > >> sensors
> > > > > > > > > >> > >> since
> > > > > > > > > >> > >> > >> they
> > > > > > > > > >> > >> > >> > > are
> > > > > > > > > >> > >> > >> > > > going to be very similar to the existing
> > > > metrics
> > > > > > and
> > > > > > > > > >> sensors.
> > > > > > > > > >> > >> To
> > > > > > > > > >> > >> > >> avoid
> > > > > > > > > >> > >> > >> > > > confusion, I have now added more detail.
> > All
> > > > > > metrics
> > > > > > > > are
> > > > > > > > > >> in
> > > > > > > > > >> > the
> > > > > > > > > >> > >> > >> group
> > > > > > > > > >> > >> > >> > > > "quotaType" and all sensors have names
> > > > starting
> > > > > > with
> > > > > > > > > >> > >> "quotaType"
> > > > > > > > > >> > >> > >> (where
> > > > > > > > > >> > >> > >> > > > quotaType is Produce/Fetch/LeaderReplication/
> > > > > > > > > >> > >> > >> > > > FollowerReplication/*IOThread*).
> > > > > > > > > >> > >> > >> > > > So there will be no reuse of existing
> > > > > > > metrics/sensors.
> > > > > > > > > The
> > > > > > > > > >> > new
> > > > > > > > > >> > >> > ones
> > > > > > > > > >> > >> > >> for
> > > > > > > > > >> > >> > >> > > > request processing time based throttling
> > > will
> > > > be
> > > > > > > > > >> completely
> > > > > > > > > >> > >> > >> independent
> > > > > > > > > >> > >> > >> > > of
> > > > > > > > > >> > >> > >> > > > existing metrics/sensors, but will be
> > > > consistent
> > > > > > in
> > > > > > > > > >> format.
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > The existing throttle_time_ms field in
> > > > > > produce/fetch
> > > > > > > > > >> > responses
> > > > > > > > > >> > >> > will
> > > > > > > > > >> > >> > >> not
> > > > > > > > > >> > >> > >> > > be
> > > > > > > > > >> > >> > >> > > > impacted by this KIP. That will continue
> > to
> > > > > return
> > > > > > > > > >> byte-rate
> > > > > > > > > >> > >> based
> > > > > > > > > >> > >> > >> > > > throttling times. In addition, a new
> field
> > > > > > > > > >> > >> > request_throttle_time_ms
> > > > > > > > > >> > >> > >> > will
> > > > > > > > > >> > >> > >> > > be
> > > > > > > > > >> > >> > >> > > > added to return request quota based
> > > throttling
> > > > > > > times.
> > > > > > > > > >> These
> > > > > > > > > >> > >> will
> > > > > > > > > >> > >> > be
> > > > > > > > > >> > >> > >> > > exposed
> > > > > > > > > >> > >> > >> > > > as new metrics on the client-side.
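
Roughly, a produce/fetch response would then carry two throttle fields (a
sketch of the shape only; the exact schema and version bump belong to the
KIP):

    throttle_time_ms         => INT32  // existing: byte-rate based throttling
    request_throttle_time_ms => INT32  // new: request-time based throttling
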
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > Since all metrics and sensors are
> > different
> > > > for
> > > > > > each
> > > > > > > > > type
> > > > > > > > > >> of
> > > > > > > > > >> > >> > quota,
> > > > > > > > > >> > >> > >> I
> > > > > > > > > >> > >> > >> > > > believe there is already sufficient
> > metrics
> > > to
> > > > > > > monitor
> > > > > > > > > >> > >> throttling
> > > > > > > > > >> > >> > on
> > > > > > > > > >> > >> > >> > both
> > > > > > > > > >> > >> > >> > > > client and broker side for each type of
> > > > > > throttling.
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > Regards,
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > Rajini
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong
> Lin
> > <
> > > > > > > > > >> > lindong28@gmail.com
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > >> > >> > >> > > >
> > > > > > > > > >> > >> > >> > > > > Hey Rajini,
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > I think it makes a lot of sense to use
> > > > > > > > io_thread_units
> > > > > > > > > >> as
> > > > > > > > > >> > >> the metric
> > > > > > > > > >> > >> > >> to
> > > > > > > > > >> > >> > >> > > quota
> > > > > > > > > >> > >> > >> > > > > users' traffic here. LGTM overall. I
> > have
> > > > some
> > > > > > > > > questions
> > > > > > > > > >> > >> > regarding
> > > > > > > > > >> > >> > >> > > > sensors.
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > - Can you be more specific in the KIP
> > what
> > > > > > sensors
> > > > > > > > > will
> > > > > > > > > >> be
> > > > > > > > > >> > >> > added?
> > > > > > > > > >> > >> > >> For
> > > > > > > > > >> > >> > >> > > > > example, it will be useful to specify
> > the
> > > > name
> > > > > > and
> > > > > > > > > >> > >> attributes of
> > > > > > > > > >> > >> > >> > these
> > > > > > > > > >> > >> > >> > > > new
> > > > > > > > > >> > >> > >> > > > > sensors.
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > - We currently have throttle-time and
> > > > > queue-size
> > > > > > > for
> > > > > > > > > >> > >> byte-rate
> > > > > > > > > >> > >> > >> based
> > > > > > > > > >> > >> > >> > > > quota.
> > > > > > > > > >> > >> > >> > > > > Are you going to have separate
> > > throttle-time
> > > > > and
> > > > > > > > > >> queue-size
> > > > > > > > > >> > >> for
> > > > > > > > > >> > >> > >> > > requests
> > > > > > > > > >> > >> > >> > > > > throttled by io_thread_unit-based
> quota,
> > > or
> > > > > will
> > > > > > > > they
> > > > > > > > > >> share
> > > > > > > > > >> > >> the
> > > > > > > > > >> > >> > >> same
> > > > > > > > > >> > >> > >> > > > > sensor?
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > - Does the throttle-time in the
> > > > > ProduceResponse
> > > > > > > and
> > > > > > > > > >> > >> > FetchResponse
> > > > > > > > > >> > >> > >> > > > contain
> > > > > > > > > >> > >> > >> > > > > time due to io_thread_unit-based
> quota?
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > - Currently the kafka server doesn't
> > > > > > > > > >> > >> > >> > > > > provide any log or metrics that tell
> > > > > > > > > >> > >> > >> > > > > whether any given clientId (or user)
> is
> > > > > > throttled.
> > > > > > > > > This
> > > > > > > > > >> is
> > > > > > > > > >> > >> not
> > > > > > > > > >> > >> > too
> > > > > > > > > >> > >> > >> > bad
> > > > > > > > > >> > >> > >> > > > > because we can still check the
> > client-side
> > > > > > > byte-rate
> > > > > > > > > >> metric
> > > > > > > > > >> > >> to
> > > > > > > > > >> > >> > >> > validate
> > > > > > > > > >> > >> > >> > > > > whether a given client is throttled.
> But
> > > > with
> > > > > > this
> > > > > > > > > >> > >> > io_thread_unit,
> > > > > > > > > >> > >> > >> > > there
> > > > > > > > > >> > >> > >> > > > > will be no way to validate whether a
> > given
> > > > > > client
> > > > > > > is
> > > > > > > > > >> slow
> > > > > > > > > >> > >> > because
> > > > > > > > > >> > >> > >> it
> > > > > > > > > >> > >> > >> > > has
> > > > > > > > > >> > >> > >> > > > > exceeded its io_thread_unit limit. It
> is
> > > > > > necessary
> > > > > > > > for
> > > > > > > > > >> user
> > > > > > > > > >> > >> to
> > > > > > > > > >> > >> > be
> > > > > > > > > >> > >> > >> > able
> > > > > > > > > >> > >> > >> > > to
> > > > > > > > > >> > >> > >> > > > > know this information to figure out whether
> > > > > > > > > >> > >> > >> > > > > they have reached their quota
> > > > > > > > > >> > >> > >> > > > > limit. How about we add a log4j log on the
> the
> > > > > server
> > > > > > > side
> > > > > > > > > to
> > > > > > > > > >> > >> > >> periodically
> > > > > > > > > >> > >> > >> > > > print
> > > > > > > > > >> > >> > >> > > > > the (client_id,
> byte-rate-throttle-time,
> > > > > > > > > >> > >> > >> > io-thread-unit-throttle-time)
> > > > > > > > > >> > >> > >> > > so
> > > > > > > > > >> > >> > >> > > > > that the kafka administrator can figure out
> > > > > > > > > >> > >> > >> > > > > which users
> > > > > > > > > >> have
> > > > > > > > > >> > >> > reached
> > > > > > > > > >> > >> > >> > their
> > > > > > > > > >> > >> > >> > > > > limit and act accordingly?
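
A rough sketch of the periodic logging Dong is suggesting (hypothetical names
and log format; nothing like this is specified in the KIP):

    import java.util.concurrent.{Executors, TimeUnit}
    import org.slf4j.LoggerFactory

    // Sketch only: periodically log per-client throttle times so an admin can
    // see which clients have hit either quota. throttleSnapshot() stands in
    // for whatever the quota managers would expose.
    val logger = LoggerFactory.getLogger("kafka.server.QuotaThrottleLogger")
    def throttleSnapshot(): Map[String, (Double, Double)] = Map.empty  // stub

    val scheduler = Executors.newSingleThreadScheduledExecutor()
    scheduler.scheduleAtFixedRate(() => {
      throttleSnapshot().foreach { case (clientId, (byteRateMs, ioThreadMs)) =>
        if (byteRateMs > 0 || ioThreadMs > 0)
          logger.info(s"client=$clientId byteRateThrottleMs=$byteRateMs " +
            s"ioThreadThrottleMs=$ioThreadMs")
      }
    }, 1, 1, TimeUnit.MINUTES)
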
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > Thanks,
> > > > > > > > > >> > >> > >> > > > > Dong
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM,
> > Guozhang
> > > > > Wang <
> > > > > > > > > >> > >> > >> wangguoz@gmail.com>
> > > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > > >> > >> > >> > > > >
> > > > > > > > > >> > >> > >> > > > > > Made a pass over the doc, overall
> LGTM
> > > > > except
> > > > > > a
> > > > > > > > > minor
> > > > > > > > > >> > >> comment
> > > > > > > > > >> > >> > on
> > > > > > > > > >> > >> > >> > the
> > > > > > > > > >> > >> > >> > > > > > throttling implementation:
> > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > >> > >> > >> > > > > > Stated as "Request processing time
> > > > > throttling
> > > > > > > will
> > > > > > > > > be
> > > > > > > > > >> > >> applied
> > > > > > > > > >> > >> > on
> > > > > > > > > >> > >> > >> > top
> > > > > > > > > >> > >> > >> > > if
> > > > > > > > > >> > >> > >> > > > > > necessary." I thought that it meant
> > the
> > > > > > request
> > > > > > > > > >> > processing
> > > > > > > > > >> > >> > time
> > > > > > > > > >> > >> > >> > > > > throttling
> > > > > > > > > >> > >> > >> > > > > > is applied first, but reading further I
> > > > > found
> > > > > > > it
> > > > > > > > > >> > actually
> > > > > > > > > >> > >> > >> meant to
> > > > > > > > > >> > >> > >> > > > apply
> > > > > > > > > >> > >> > >> > > > > > produce / fetch byte rate throttling
> > > > first.
> > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > >> > >> > >> > > > > > Also the last sentence "The
> remaining
> > > > delay
> > > > > if
> > > > > > > any
> > > > > > > > > is
> > > > > > > > > >> > >> applied
> > > > > > > > > >> > >> > to
> > > > > > > > > >> > >> > >> > the
> > > > > > > > > >> > >> > >> > > > > > response." is a bit confusing to me.
> > > Maybe
> > > > > > > > rewording
> > > > > > > > > >> it a
> > > > > > > > > >> > >> bit?
> > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > >> > >> > >> > > > > > Guozhang
> > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun
> > > Rao <
> > > > > > > > > >> > jun@confluent.io
> > > > > > > > > >> > >> >
> > > > > > > > > >> > >> > >> wrote:
> > > > > > > > > >> > >> > >> > > > > >
> > > > > > > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > Thanks for the updated KIP. The
> > latest
> > > > > > > proposal
> > > > > > > > > >> looks
> > > > > > > > > >> > >> good
> > > > > > > > > >> > >> > to
> > > > > > > > > >> > >> > >> me.
> > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > Jun
> > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM,
> > > Rajini
> > > > > > > Sivaram
> > > > > > > > <
> > > > > > > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > Thank you for the feedback.
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > 1. I have updated the KIP to use
> > > > > absolute
> > > > > > > > units
> > > > > > > > > >> > >> instead of
> > > > > > > > > >> > >> > >> > > > > percentage.
> > > > > > > > > >> > >> > >> > > > > > > The
> > > > > > > > > >> > >> > >> > > > > > > > property is called
> > > > > > > > > >> > >> > >> > > > > > > > *io_thread_units*
> > > > to
> > > > > > > align
> > > > > > > > > with
> > > > > > > > > >> > the
> > > > > > > > > >> > >> > >> thread
> > > > > > > > > >> > >> > >> > > count
> > > > > > > > > >> > >> > >> > > > > > > > property *num.io.threads*. When
> we
> > > > > > implement
> > > > > > > > > >> network
> > > > > > > > > >> > >> > thread
> > > > > > > > > >> > >> > >> > > > > utilization
> > > > > > > > > >> > >> > >> > > > > > > > quotas, we can add another
> > property
> > > > > > > > > >> > >> > *network_thread_units.*
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > 2. ControlledShutdown is already
> > > > listed
> > > > > > > under
> > > > > > > > > the
> > > > > > > > > >> > >> exempt
> > > > > > > > > >> > >> > >> > > requests.
> > > > > > > > > >> > >> > >> > > > > Jun,
> > > > > > > > > >> > >> > >> > > > > > > did
> > > > > > > > > >> > >> > >> > > > > > > > you mean a different request
> that
> > > > needs
> > > > > to
> > > > > > > be
> > > > > > > > > >> added?
> > > > > > > > > >> > >> The
> > > > > > > > > >> > >> > >> four
> > > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > > >> > >> > >> > > > > > > > currently exempt in the KIP are
> > > > > > StopReplica,
> > > > > > > > > >> > >> > >> > ControlledShutdown,
> > > > > > > > > >> > >> > >> > > > > > > > LeaderAndIsr and UpdateMetadata.
> > > These
> > > > > are
> > > > > > > > > >> controlled
> > > > > > > > > >> > >> > using
> > > > > > > > > >> > >> > >> > > > > > ClusterAction
> > > > > > > > > >> > >> > >> > > > > > > > ACL, so it is easy to exclude
> and
> > > only
> > > > > > > > throttle
> > > > > > > > > if
> > > > > > > > > >> > >> > >> > unauthorized.
> > > > > > > > > >> > >> > >> > > I
> > > > > > > > > >> > >> > >> > > > > > wasn't
> > > > > > > > > >> > >> > >> > > > > > > > sure if there are other requests
> > > used
> > > > > only
> > > > > > > for
> > > > > > > > > >> > >> > inter-broker
> > > > > > > > > >> > >> > >> > communication that
> > > > > > > > > >> > >> > >> > > > > needed
> > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > >> > >> > >> > > > > > > > be excluded.
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > 3. I was thinking the smallest
> > > change
> > > > > > would
> > > > > > > be
> > > > > > > > > to
> > > > > > > > > >> > >> replace
> > > > > > > > > >> > >> > >> all
> > > > > > > > > >> > >> > >> > > > > > references
> > > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > > >> > >> > >> > > > > > > > *requestChannel.sendResponse()*
> > > with
> > > > a
> > > > > > > local
> > > > > > > > > >> method
> > > > > > > > > >> > >> > >> > > > > > > > *sendResponseMaybeThrottle()*
> that
> > > > does
> > > > > > the
> > > > > > > > > >> > throttling
> > > > > > > > > >> > >> if
> > > > > > > > > >> > >> > >> any
> > > > > > > > > >> > >> > >> > > plus
> > > > > > > > > >> > >> > >> > > > > send
> > > > > > > > > >> > >> > >> > > > > > > > response. If we throttle first
> in
> > > > > > > > > >> > *KafkaApis.handle()*,
> > > > > > > > > >> > >> > the
> > > > > > > > > >> > >> > >> > time
> > > > > > > > > >> > >> > >> > > > > spent
> > > > > > > > > >> > >> > >> > > > > > > > within the method handling the
> > > request
> > > > > > will
> > > > > > > > not
> > > > > > > > > be
> > > > > > > > > >> > >> > recorded
> > > > > > > > > >> > >> > >> or
> > > > > > > > > >> > >> > >> > > used
> > > > > > > > > >> > >> > >> > > > > in
> > > > > > > > > >> > >> > >> > > > > > > > throttling. We can look into
> this
> > > > again
> > > > > > when
> > > > > > > > the
> > > > > > > > > >> PR
> > > > > > > > > >> > is
> > > > > > > > > >> > >> > ready
> > > > > > > > > >> > >> > >> > for
> > > > > > > > > >> > >> > >> > > > > > review.
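
A minimal sketch of the helper described above (hypothetical signature and
member names, assuming the surrounding KafkaApis fields requestChannel,
quotaManager and time; the real shape will be settled in the PR):

    // Sketch only: wraps requestChannel.sendResponse so that the request
    // handler time is recorded, and any quota delay applied, only after the
    // request has been fully handled.
    def sendResponseMaybeThrottle(request: RequestChannel.Request,
                                  response: RequestChannel.Response): Unit = {
      val handlerNanos = time.nanoseconds - request.requestDequeueTimeNanos
      quotaManager.recordAndMaybeThrottle(request.session, handlerNanos,
        () => requestChannel.sendResponse(response))
    }
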
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > Regards,
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > Rajini
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM,
> > > Roger
> > > > > > > Hoover
> > > > > > > > <
> > > > > > > > > >> > >> > >> > > > > roger.hoover@gmail.com>
> > > > > > > > > >> > >> > >> > > > > > > > wrote:
> > > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > > Great to see this KIP and the
> > > > > excellent
> > > > > > > > > >> discussion.
> > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > > To me, Jun's suggestion makes
> > > sense.
> > > > > If
> > > > > > > my
> > > > > > > > > >> > >> application
> > > > > > > > > >> > >> > is
> > > > > > > > > >> > >> > >> > > > > allocated
> > > > > > > > > >> > >> > >> > > > > > 1
> > > > > > > > > >> > >> > >> > > > > > > > > request handler unit, then
> it's
> > as
> > > > if
> > > > > I
> > > > > > > > have a
> > > > > > > > > >> > Kafka
> > > > > > > > > >> > >> > >> broker
> > > > > > > > > >> > >> > >> > > with
> > > > > > > > > >> > >> > >> > > > a
> > > > > > > > > >> > >> > >> > > > > > > single
> > > > > > > > > >> > >> > >> > > > > > > > > request handler thread
> dedicated
> > > to
> > > > > me.
> > > > > > > > > That's
> > > > > > > > > >> the
> > > > > > > > > >> > >> > most I
> > > > > > > > > >> > >> > >> > can
> > > > > > > > > >> > >> > >> > > > use,
> > > > > > > > > >> > >> > >> > > > > > at
> > > > > > > > > >> > >> > >> > > > > > > > > least.  That allocation
> doesn't
> > > > change
> > > > > > > even
> > > > > > > > if
> > > > > > > > > >> an
> > > > > > > > > >> > >> admin
> > > > > > > > > >> > >> > >> later
> > > > > > > > > >> > >> > >> > > > > > increases
> > > > > > > > > >> > >> > >> > > > > > > > the
> > > > > > > > > >> > >> > >> > > > > > > > > size of the request thread
> pool
> > on
> > > > the
> > > > > > > > broker.
> > > > > > > > > >> > It's
> > > > > > > > > >> > >> > >> similar
> > > > > > > > > >> > >> > >> > to
> > > > > > > > > >> > >> > >> > > > the
> > > > > > > > > >> > >> > >> > > > > > CPU
> > > > > > > > > >> > >> > >> > > > > > > > > abstraction that VMs and
> > > containers
> > > > > get
> > > > > > > from
> > > > > > > > > >> > >> hypervisors
> > > > > > > > > >> > >> > >> or
> > > > > > > > > >> > >> > >> > OS
> > > > > > > > > >> > >> > >> > > > > > > > schedulers.
> > > > > > > > > >> > >> > >> > > > > > > > > While different client access
> > > > patterns
> > > > > > can
> > > > > > > > use
> > > > > > > > > >> > wildly
> > > > > > > > > >> > >> > >> > different
> > > > > > > > > >> > >> > >> > > > > > amounts
> > > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > > >> > >> > >> > > > > > > > > request thread resources per
> > > > request,
> > > > > a
> > > > > > > > given
> > > > > > > > > >> > >> > application
> > > > > > > > > >> > >> > >> > will
> > > > > > > > > >> > >> > >> > > > > > > generally
> > > > > > > > > >> > >> > >> > > > > > > > > have a stable access pattern
> and
> > > can
> > > > > > > figure
> > > > > > > > > out
> > > > > > > > > >> > >> > >> empirically
> > > > > > > > > >> > >> > >> > how
> > > > > > > > > >> > >> > >> > > > > many
> > > > > > > > > >> > >> > >> > > > > > > > > "request thread units" it
> needs
> > to
> > > > > meet
> > > > > > > it's
> > > > > > > > > >> > >> > >> > throughput/latency
> > > > > > > > > >> > >> > >> > > > > > goals.
> > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > > Cheers,
> > > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > > >> > >> > >> > > > > > > > > Roger
> > > > > > > > > >> > >> > >> > > > > > > > >
On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

> Hi, Rajini,
>
> Thanks for the updated KIP. A few more comments.
>
> 1. A concern with request_time_percent is that it's not an absolute
> value. Let's say you give a user a 10% limit. If the admin doubles the
> number of request handler threads, that user now actually has twice the
> absolute capacity. This may confuse people a bit. So, perhaps setting the
> quota based on an absolute request thread unit is better.
>
> 2. ControlledShutdownRequest is also an inter-broker request and needs
> to be excluded from throttling.
>
> 3. Implementation wise, I am wondering if it's simpler to apply the
> request time throttling first in KafkaApis.handle(). Otherwise, we will
> need to add the throttling logic in each type of request.
>
> Thanks,
>
> Jun
>
> On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram
> <rajinisivaram@gmail.com> wrote:
>
> > Jun,
> >
> > Thank you for the review.
> >
> > I have reverted to the original KIP that throttles based on request
> > handler utilization. At the moment, it uses a percentage, but I am
> > happy to change to a fraction (out of 1 instead of 100) if required. I
> > have added the examples from this discussion to the KIP. Also added a
> > "Future Work" section to address network thread utilization. The
> > configuration is named "request_time_percent" with the expectation
> > that it can also be used as the limit for network thread utilization
> > when that is implemented, so that users have to set only one config
> > for the two and not have to worry about the internal distribution of
> > the work between the two thread pools in Kafka.
> >
> > Regards,
> >
> > Rajini
> >
> > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Thanks for the proposal.
> > >
> > > The benefit of using the request processing time over the request
> > > rate is exactly what people have said. I will just expand on that a
> > > bit. Consider the following case. The producer sends a produce
> > > request with a 10MB message but compressed to 100KB with gzip. The
> > > decompression of the message on the broker could take 10-15 seconds,
> > > during which time a request handler thread is completely blocked. In
> > > this case, neither the byte-in quota nor the request rate quota may
> > > be effective in protecting the broker. Consider another case. A
> > > consumer group starts with 10 instances and later on switches to 20
> > > instances. The request rate will likely double, but the actual load
> > > on the broker may not double since each fetch request only contains
> > > half of the partitions. A request rate quota may not be easy to
> > > configure in this case.
> > >
> > > What we really want is to be able to prevent a client from using too
> > > much of the server side resources. In this particular KIP, this
> > > resource is the capacity of the request handler threads. I agree
> > > that it may not be intuitive for the users to determine how to set
> > > the right limit. However, this is not completely new and has been
> > > done in the container world already. For example, Linux cgroups
> > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> > > have the concept of cpu.cfs_quota_us, which specifies the total
> > > amount of time in microseconds for which all tasks in a cgroup can
> > > run during a one second period. We can potentially model the request
> > > handler threads in a similar way. For example, each request handler
> > > thread can be 1 request handler unit and the admin can configure a
> > > limit on how many units (say 0.01) a client can have.
> > >
> > > Regarding not throttling the internal broker to broker requests: we
> > > could do that. Alternatively, we could just let the admin configure
> > > a high limit for the kafka user (it may not be able to do that
> > > easily based on clientId though).
> > >
> > > Ideally we want to be able to protect the utilization of the network
> > > thread pool too. The difficulty is mostly what Rajini said: (1) The
> > > mechanism for throttling the requests is through Purgatory and we
> > > will have to think through how to integrate that into the network
> > > layer. (2) In the network layer, currently we know the user, but not
> > > the clientId of the request. So, it's a bit tricky to throttle based
> > > on clientId there. Plus, the byte-out quota can already protect the
> > > network thread utilization for fetch requests. So, if we can't
> > > figure out this part right now, just focusing on the request
> > > handling threads for this KIP is still a useful feature.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram
> > > <rajinisivaram@gmail.com> wrote:
> > >
> > > > Thank you all for the feedback.
> > > >
> > > > Jay: I have removed the exemption for consumer heartbeat etc.
> > > > Agree that protecting the cluster is more important than
> > > > protecting individual apps. Have retained the exemption for
> > > > StopReplica/LeaderAndIsr etc; these are throttled only if
> > > > authorization fails (so they can't be used for DoS attacks in a
> > > > secure cluster, but inter-broker requests can complete without
> > > > delays).
> > > >
> > > > I will wait another day to see if there is any objection to quotas
> > > > based on request processing time (as opposed to request rate) and
> > > > if there are no objections, I will revert to the original proposal
> > > > with some changes.
> > > >
> > > > The original proposal was only including the time used by the
> > > > request handler threads (that made calculation easy). I think the
> > > > suggestion is to include the time spent in the network threads as
> > > > well since that may be significant. As Jay pointed out, it is more
> > > > complicated to calculate the total available CPU time and convert
> > > > to a ratio when there are *m* I/O threads and *n* network threads.
> > > > ThreadMXBean#getThreadCpuTime() may give us what we want, but it
> > > > can be very expensive on some platforms. As Becket and Guozhang
> > > > have pointed out, we do have several time measurements already for
> > > > generating metrics that we could use, though we might want to
> > > > switch to nanoTime() instead of currentTimeMillis() since some of
> > > > the values for small requests may be < 1ms. But rather than add up
> > > > the time spent in the I/O thread and the network thread, wouldn't
> > > > it be better to convert the time spent on each thread into a
> > > > separate ratio? UserA has a request quota of 5%. Can we take that
> > > > to mean that UserA can use 5% of the time on network threads and
> > > > 5% of the time on I/O threads? If either is exceeded, the response
> > > > is throttled - it would mean maintaining two sets of metrics for
> > > > the two durations, but would result in more meaningful ratios. We
> > > > could define two quota limits (UserA has 5% of request threads and
> > > > 10% of network threads), but that seems unnecessary and harder to
> > > > explain to users.
> > > >
> > > > Back to why and how quotas are applied to network thread
> > > > utilization:
> > > >
> > > > a) In the case of fetch, the time spent in the network thread may
> > > > be significant and I can see the need to include this. Are there
> > > > other requests where the network thread utilization is
> > > > significant? In the case of fetch, request handler thread
> > > > utilization would throttle clients with a high request rate and
> > > > low data volume, and the fetch byte rate quota will throttle
> > > > clients with a high data volume. Network thread utilization is
> > > > perhaps proportional to the data volume. I am wondering if we even
> > > > need to throttle based on network thread utilization or whether
> > > > the data volume quota covers this case.
> > > >
> > > > b) At the moment, we record and check for quota violation at the
> > > > same time. If a quota is violated, the response is delayed. Using
> > > > Jay's example of disk reads for fetches happening in the network
> > > > thread, we can't record and delay a response after the disk reads.
> > > > We could record the time spent on the network thread when the
> > > > response is complete and introduce a delay for handling a
> > > > subsequent request (separating out recording and quota violation
> > > > handling in the case of network thread overload). Does that make
> > > > sense?
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin
> > > > <becket.qin@gmail.com> wrote:
> > > >
> > > > > Hey Jay,
> > > > >
> > > > > Yeah, I agree that enforcing the CPU time is a little tricky. I
> > > > > am thinking that maybe we can use the existing request
> > > > > statistics. They are already very detailed, so we can probably
> > > > > see the approximate CPU time from them, e.g. something like
> > > > > (total_time - request/response_queue_time - remote_time).
> > > > >
> > > > > I agree with Guozhang that when a user is throttled it is likely
> > > > > that we need to see if anything has gone wrong first, and if the
> > > > > users are well behaving and just need more resources, we will
> > > > > have to bump up the quota for them. It is true that
> > > > > pre-allocating CPU time quota precisely for the users is
> > > > > difficult. So in practice it would probably be more like first
> > > > > setting a relatively high protective CPU time quota for everyone
> > > > > and increasing it for some individual clients on demand.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang
> > > > > <wangguoz@gmail.com> wrote:
> > > > >
> > > > > > This is a great proposal, glad to see it happening.
> > > > > >
> > > > > > I am inclined to the CPU throttling, or more specifically the
> > > > > > processing time ratio, instead of the request rate throttling
> > > > > > as well. Becket has summed up my rationale very well above,
> > > > > > and one thing to add here is that the former has good support
> > > > > > for both "protecting against rogue clients" as well as
> > > > > > "utilizing a cluster for multi-tenancy usage": when thinking
> > > > > > about how to explain this to the end users, I find it actually
> > > > > > more natural than the request rate since, as mentioned above,
> > > > > > different requests will have quite different "cost", and Kafka
> > > > > > today already has various request types (produce, fetch,
> > > > > > admin, metadata, etc); because of that, the request rate
> > > > > > throttling may not be as effective unless it is set very
> > > > > > conservatively.
> > > > > >
> > > > > > Regarding user reactions when they are throttled, I think it
> > > > > > may differ case-by-case, and needs to be discovered / guided
> > > > > > by looking at the relevant metrics. So in other words users
> > > > > > would not expect to get additional information by simply being
> > > > > > told "hey, you are throttled", which is all that throttling
> > > > > > does; they need to take a follow-up step and see "hmm, I'm
> > > > > > throttled probably because of ..", which is done by looking at
> > > > > > other metric values: e.g. whether I'm bombarding the brokers
> > > > > > with ...

[Message clipped]



-- 
*Todd Palino*
Staff Site Reliability Engineer
Data Infrastructure Streaming



linkedin.com/in/toddpalino
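
The thread above converges on throttling by time spent in the request
handler threads, measured as a percentage over a time window, with
violations handled by delaying the response (the same mechanism the
existing byte-rate quotas use). A minimal sketch of that accounting,
using hypothetical names (RequestTimeQuota, recordAndGetDelayMs) rather
than the actual Kafka classes:

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch of a request-time quota: track each user's request handler
    // thread time over a one-second window and compute a throttle delay
    // when the user's share of one thread's time exceeds the quota.
    public class RequestTimeQuota {
        private static final long WINDOW_NANOS = 1_000_000_000L; // 1s window
        private final double quotaPercent; // e.g. 10.0 => 10% of one thread
        private final Map<String, Window> windows = new ConcurrentHashMap<>();

        public RequestTimeQuota(double quotaPercent) {
            this.quotaPercent = quotaPercent;
        }

        private static final class Window {
            long start = System.nanoTime();
            long nanosUsed = 0L;
        }

        // Called by a request handler thread once it finishes a request,
        // passing the thread time that request consumed. Returns the delay
        // (ms) to apply to the response, or 0 if the user is within quota.
        public long recordAndGetDelayMs(String user, long threadTimeNanos) {
            Window w = windows.computeIfAbsent(user, u -> new Window());
            synchronized (w) {
                long now = System.nanoTime();
                if (now - w.start > WINDOW_NANOS) { // roll to a new window
                    w.start = now;
                    w.nanosUsed = 0L;
                }
                w.nanosUsed += threadTimeNanos;
                double percentUsed = 100.0 * w.nanosUsed / WINDOW_NANOS;
                if (percentUsed <= quotaPercent) {
                    return 0L;
                }
                // One simple policy: delay until the overshoot is paid back.
                long allowedNanos = (long) (quotaPercent / 100.0 * WINDOW_NANOS);
                return (w.nanosUsed - allowedNanos) / 1_000_000L; // ns -> ms
            }
        }
    }

For example, with quotaPercent = 10, a user that burns 250 ms of handler
thread time within a one-second window would see roughly a 150 ms delay
on its next response.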

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Becket Qin <be...@gmail.com>.
I see. Good point about SSL.

I just asked Todd to take a look.

Thanks,

Jiangjie (Becket) Qin

On Tue, Mar 7, 2017 at 2:17 PM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Jiangjie,
>
> Yes, I agree that byte rate already protects the network threads
> indirectly. I am not sure that byte rate fully captures the CPU overhead
> in the network threads due to SSL. So, at a high level, we can use the
> request time limit to protect CPU and the byte rate to protect storage
> and network.
>
> Also, do you think you can get Todd to comment on this KIP?
>
> Thanks,
>
> Jun
>
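
Jun's division of labor here - a request time limit for CPU, byte rate
for storage and network - implies that the broker applies two independent
quota checks per request and throttles on whichever delay is longer. A
rough sketch of that layering under hypothetical names (QuotaTracker,
QuotaEnforcer; not the actual Kafka API):

    // Hypothetical layering of the two quota checks discussed above.
    interface QuotaTracker {
        // Record usage (bytes or thread nanos); return throttle delay in ms.
        long recordAndGetDelayMs(String user, long amount);
    }

    final class QuotaEnforcer {
        private final QuotaTracker byteRateQuota;    // network/storage
        private final QuotaTracker requestTimeQuota; // CPU (incl. SSL cost)

        QuotaEnforcer(QuotaTracker byteRateQuota, QuotaTracker requestTimeQuota) {
            this.byteRateQuota = byteRateQuota;
            this.requestTimeQuota = requestTimeQuota;
        }

        // Both checks run on every completed request; the longer delay wins,
        // so a client heavy on either resource gets throttled.
        long throttleDelayMs(String user, long responseBytes, long threadTimeNanos) {
            long byteDelay = byteRateQuota.recordAndGetDelayMs(user, responseBytes);
            long timeDelay = requestTimeQuota.recordAndGetDelayMs(user, threadTimeNanos);
            return Math.max(byteDelay, timeDelay);
        }
    }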
> On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <be...@gmail.com> wrote:
>
> > Hi Rajini/Jun,
> >
> > The percentage-based reasoning sounds good.
> > One thing I am wondering is: if we assume the network threads are just
> > doing network I/O, can we say the byte rate quota is already a sort of
> > network threads quota?
> > If we take network threads into consideration here, would that be
> > somewhat overlapping with the byte rate quota?
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram
> > <rajinisivaram@gmail.com> wrote:
> >
> > > Jun,
> > >
> > > Thank you for the explanation; I hadn't realized you meant a
> > > percentage of the total thread pool. If everyone is OK with Jun's
> > > suggestion, I will update the KIP.
> > >
> > > Thanks,
> > >
> > > Rajini
> > >
> > > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > Let's take your example. Let's say a user sets the limit to 50%. I
> > > > am not sure if it's better to apply the same percentage separately
> > > > to the network and io thread pools. For example, for produce
> > > > requests, most of the time will be spent in the io threads whereas
> > > > for fetch requests, most of the time will be in the network
> > > > threads. So, using the same percentage in both thread pools means
> > > > one of the pools' resources will be over-allocated.
> > > >
> > > > An alternative way is to simply model the network and io thread
> > > > pools together. If you get 10 io threads and 5 network threads, you
> > > > get 1500% request processing power. A 50% limit means a total of
> > > > 750% processing power. We just add up the time a user request spent
> > > > in either a network or an io thread. If that total exceeds 750% (it
> > > > doesn't matter whether it's spent more in a network or an io
> > > > thread), the request will be throttled. This seems more general and
> > > > is not sensitive to the current implementation detail of having
> > > > separate network and io thread pools. In the future, if the
> > > > threading model changes, the same concept of quota can still be
> > > > applied. For now, since it's a bit tricky to add the delay logic in
> > > > the network thread pool, we could probably just do the delaying
> > > > only in the io threads as you suggested earlier.
> > > >
> > > > There is still the orthogonal question of whether a quota of 50% is
> > > > out of 100% or 100% * #total processing threads. My feeling is that
> > > > the latter is slightly better based on my explanation earlier. The
> > > > way to describe this quota to the users can be "share of elapsed
> > > > request processing time on a single CPU" (similar to top).
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
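
Jun's combined model treats the two pools as one budget: with n io
threads and m network threads the broker has (n + m) * 100% of capacity,
and a user's time on either pool draws from the same allowance. A sketch
under hypothetical names (again, not Kafka's actual implementation):

    // Sketch of the combined-pool quota: time from io and network threads
    // is added into one budget, so the check is insensitive to which pool
    // did the work or how the threading model evolves.
    public class CombinedThreadTimeQuota {
        private static final long WINDOW_NANOS = 1_000_000_000L; // 1s window
        private final double quotaPercent; // e.g. 750.0 = 7.5 threads' worth

        private long windowStart = System.nanoTime();
        private long nanosUsed = 0L; // io thread time + network thread time

        public CombinedThreadTimeQuota(double quotaPercent) {
            this.quotaPercent = quotaPercent;
        }

        // Returns true if the user is still within quota after this request.
        public synchronized boolean record(long ioThreadNanos, long networkThreadNanos) {
            long now = System.nanoTime();
            if (now - windowStart > WINDOW_NANOS) { // start a fresh window
                windowStart = now;
                nanosUsed = 0L;
            }
            nanosUsed += ioThreadNanos + networkThreadNanos; // pool is irrelevant
            double percentUsed = 100.0 * nanosUsed / WINDOW_NANOS; // vs one thread
            return percentUsed <= quotaPercent;
        }
    }

With 10 io threads and 5 network threads, a 50% share corresponds to
quotaPercent = 750; whether a user spends that budget on decompression in
the handler pool or on SSL work in the network pool makes no difference
to the check, which is exactly the property Jun highlights.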
> > > >
> > > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram
> > > > <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > Agree about the two scenarios.
> > > > >
> > > > > But still not sure about a single quota covering both network
> > > > > threads and I/O threads with a per-thread quota. If there are 10
> > > > > I/O threads and 5 network threads and I want to assign half the
> > > > > quota to userA, the quota would be 750%. I imagine, internally,
> > > > > we would convert this to 500% for I/O threads and 250% for
> > > > > network threads to allocate 50% of each pool.
> > > > >
> > > > > A couple of scenarios:
> > > > >
> > > > > 1. Admin adds 1 extra network thread. To retain 50%, the admin
> > > > > now needs to allocate 800% for each user, or increase the quota
> > > > > for a few users. To me, it feels like the admin needs to convert
> > > > > 50% to 800% and Kafka internally needs to convert 800% to (500%,
> > > > > 300%). Everyone using just 50% feels a lot simpler.
> > > > >
> > > > > 2. We decide to add some other thread to this list. The admin
> > > > > needs to know exactly how many threads form the maximum quota,
> > > > > and we can be changing this between broker versions as we add
> > > > > more to the list. Again, a single overall percent would be a lot
> > > > > simpler.
> > > > >
> > > > > There were others who were unconvinced by a single percent in the
> > > > > initial proposal and were happier with thread units similar to
> > > > > CPU units, so I am ok with going with per-thread quotas (as units
> > > > > or percent). Just not sure it makes things easier for the admin
> > > > > in all cases.
> > > > >
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > > >
> > > > >
> > > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Consider modeling it as n * 100% units. For 2), the question
> > > > > > is what's causing the I/O threads to be saturated. It's
> > > > > > unlikely that all users' utilization has increased at the same
> > > > > > time. A more likely case is that a few isolated users'
> > > > > > utilization has increased. If so, after increasing the number
> > > > > > of threads, the admin just needs to adjust the quotas for a few
> > > > > > isolated users, which is expected and is less work.
> > > > > >
> > > > > > Consider modeling it as 1 * 100% unit. For 1), all users'
> > > > > > quotas need to be adjusted, which is unexpected and is more
> > > > > > work.
> > > > > >
> > > > > > So, to me, the n * 100% model seems more convenient.
> > > > > >
> > > > > > As for a future extension to cover network thread utilization,
> > > > > > I was thinking that one way is to simply model the capacity as
> > > > > > (n + m) * 100% units, where n and m are the number of network
> > > > > > and I/O threads, respectively. Then, for each user, we can just
> > > > > > add up the utilization in the network and the I/O threads. If
> > > > > > we do this, we don't need a new type of quota.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram
> > > > > > <rajinisivaram@gmail.com> wrote:
> > > > > >
> > > > > > > Jun,
> > > > > > >
> > > > > > > If we use request.percentage as the percentage used in a
> > > > > > > single I/O thread, the total percentage being allocated will
> > > > > > > be num.io.threads * 100 for I/O threads and
> > > > > > > num.network.threads * 100 for network threads. A single quota
> > > > > > > covering the two as a percentage wouldn't quite work if you
> > > > > > > want to allocate the same proportion in both cases. If we
> > > > > > > want to treat threads as separate units, won't we need two
> > > > > > > quota configurations regardless of whether we use units or
> > > > > > > percentage? Perhaps I misunderstood your suggestion.
> > > > > > >
> > > > > > > I think there are two cases:
> > > > > > >
> > > > > > >    1. The use case that you mentioned, where an admin is
> > > > > > >    adding more users, decides to add more I/O threads, and
> > > > > > >    expects to find free quota to allocate to the new users.
> > > > > > >    2. The admin adds more I/O threads because the I/O threads
> > > > > > >    are saturated and there are cores available to allocate,
> > > > > > >    even though the number of users/clients hasn't changed.
> > > > > > >
> > > > > > > If we treated I/O threads as a single unit of 100%, all user
> > > > > > > quotas need to be reallocated for 1). If we treated I/O
> > > > > > > threads as n units with n * 100%, all user quotas need to be
> > > > > > > reallocated for 2), otherwise some of the new threads may
> > > > > > > just not be used. Either way, it should be easy to write a
> > > > > > > script to decrease/increase quotas by a multiple for all
> > > > > > > users.
> > > > > > >
> > > > > > > So it really boils down to which quota unit is most intuitive
> > > > > > > in terms of configuration. And from the discussion so far, it
> > > > > > > feels like opinion is divided on whether quotas should be
> > > > > > > carved out of an absolute 100% (or 1 unit) or be relative to
> > > > > > > the number of threads (n * 100% or n units).
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io>
> > wrote:
> > > > > > >
> > > > > > > > Another way to express an absolute limit is to use
> > > > > request.percentage,
> > > > > > > but
> > > > > > > > treat it as the percentage used in a single request handling
> > > > thread.
> > > > > > For
> > > > > > > > now, the request handling threads can be just the io threads.
> > In
> > > > the
> > > > > > > > future, they can cover the network threads as well. This is
> > > similar
> > > > > to
> > > > > > > how
> > > > > > > > top reports CPU usage and may be a bit easier for people to
> > > > > understand.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io>
> > > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Jay,
> > > > > > > > >
> > > > > > > > > 2. Regarding request.unit vs request.percentage. I started
> > with
> > > > > > > > > request.percentage too. The reasoning for request.unit is
> the
> > > > > > > following.
> > > > > > > > > Suppose that the capacity has been reached on a broker and
> > the
> > > > > admin
> > > > > > > > needs
> > > > > > > > > to add a new user. A simple way to increase the capacity is
> > to
> > > > > > increase
> > > > > > > > the
> > > > > > > > > number of io threads, assuming there are still enough
> cores.
> > If
> > > > the
> > > > > > > limit
> > > > > > > > > is based on percentage, the additional capacity
> automatically
> > > > gets
> > > > > > > > > distributed to existing users and we haven't really carved
> > out
> > > > any
> > > > > > > > > additional resource for the new user. Now, is it easy for a user
> > > > > > > > > to reason about 0.1 unit vs 10%? My feeling is that both are hard
> > > > > > > > > and have to be configured empirically. Not sure if percentage is
> > > > > > > > > obviously easier to reason about.
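> > > > > > > > >
> > > > > > > > > As a rough illustration of why the two knobs diverge (my own
> > > > > > > > > example, not from the KIP):
> > > > > > > > >
> > > > > > > > > object QuotaUnits {
> > > > > > > > >   // Convert an absolute thread-unit quota into a percentage of
> > > > > > > > >   // total request handler capacity.
> > > > > > > > >   def unitsToPercent(units: Double, numIoThreads: Int): Double =
> > > > > > > > >     units / numIoThreads * 100.0
> > > > > > > > >
> > > > > > > > >   def main(args: Array[String]): Unit = {
> > > > > > > > >     // 0.8 units on an 8-thread broker is 10% of total capacity;
> > > > > > > > >     // doubling the threads leaves the user's absolute allocation
> > > > > > > > >     // unchanged but halves its share of the total.
> > > > > > > > >     println(unitsToPercent(0.8, 8))   // 10.0
> > > > > > > > >     println(unitsToPercent(0.8, 16))  // 5.0
> > > > > > > > >   }
> > > > > > > > > }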
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <
> jay@confluent.io
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > >> A couple of quick points:
> > > > > > > > >>
> > > > > > > > >> 1. Even though the implementation of this quota is only
> > using
> > > io
> > > > > > > thread
> > > > > > > > >> time, i think we should call it something like
> > "request-time".
> > > > > This
> > > > > > > will
> > > > > > > > >> give us flexibility to improve the implementation to cover
> > > > network
> > > > > > > > threads
> > > > > > > > >> in the future and will avoid exposing internal details
> like
> > > our
> > > > > > thread
> > > > > > > > >> pools on the server.
> > > > > > > > >>
> > > > > > > > >> 2. Jun/Roger, I get what you are trying to fix but the
> idea
> > of
> > > > > > > > >> thread/units
> > > > > > > > >> is super unintuitive as a user-facing knob. I had to read
> > the
> > > > KIP
> > > > > > like
> > > > > > > > >> eight times to understand this. I'm not sure about your point
> > > > > > > > >> that increasing the number of threads is a problem with a
> > > > > > > > >> percentage-based value; it really depends on whether the user
> > > > > > > > >> thinks about the "percentage of request processing time" or
> > > > > > > > >> "thread units". If they think "I have allocated 10% of my request
> > > > > > > > >> processing time to user x" then it is a bug that increasing the
> > > > > > > > >> thread count decreases that percent as it does in the
> > > > > > > > >> current proposal. As a practical matter I think the only
> way
> > > to
> > > > > > > actually
> > > > > > > > >> reason about this is as a percent---I just don't believe
> > > people
> > > > > are
> > > > > > > > going
> > > > > > > > >> to think, "ah, 4.3 thread units, that is the right
> amount!".
> > > > > > Instead I
> > > > > > > > >> think they have to understand this thread unit concept,
> > figure
> > > > out
> > > > > > > what
> > > > > > > > >> they have set in number of threads, compute a percent and
> > then
> > > > > come
> > > > > > up
> > > > > > > > >> with
> > > > > > > > >> the number of thread units, and these will all be wrong if
> > > that
> > > > > > thread
> > > > > > > > >> count changes. I also think this ties us to throttling the
> > I/O
> > > > > > thread
> > > > > > > > >> pool,
> > > > > > > > >> which may not be where we want to end up.
> > > > > > > > >>
> > > > > > > > >> 3. For what it's worth I do think having a single
> > throttle_ms
> > > > > field
> > > > > > in
> > > > > > > > all
> > > > > > > > >> the responses that combines all throttling from all quotas
> > is
> > > > > > probably
> > > > > > > > the
> > > > > > > > >> simplest. There could be a use case for having separate
> > fields
> > > > for
> > > > > > > each,
> > > > > > > > >> but I think that is actually harder to use/monitor in the
> > > common
> > > > > > case
> > > > > > > so
> > > > > > > > >> unless someone has a use case I think just one should be
> > fine.
> > > > > > > > >>
> > > > > > > > >> -Jay
> > > > > > > > >>
> > > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > > > > > rajinisivaram@gmail.com>
> > > > > > > > >> wrote:
> > > > > > > > >>
> > > > > > > > >> > I have updated the KIP based on the discussions so far.
> > > > > > > > >> >
> > > > > > > > >> >
> > > > > > > > >> > Regards,
> > > > > > > > >> >
> > > > > > > > >> > Rajini
> > > > > > > > >> >
> > > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > > > > > >> rajinisivaram@gmail.com>
> > > > > > > > >> > wrote:
> > > > > > > > >> >
> > > > > > > > >> > > Thank you all for the feedback.
> > > > > > > > >> > >
> > > > > > > > >> > > Ismael #1. It makes sense not to throttle inter-broker
> > > > > requests
> > > > > > > like
> > > > > > > > >> > > LeaderAndIsr etc. The simplest way to ensure that
> > clients
> > > > > cannot
> > > > > > > use
> > > > > > > > >> > these
> > > > > > > > >> > > requests to bypass quotas for DoS attacks is to ensure
> > > that
> > > > > ACLs
> > > > > > > > >> prevent
> > > > > > > > >> > > clients from using these requests and unauthorized
> > > requests
> > > > > are
> > > > > > > > >> included
> > > > > > > > >> > > towards quotas.
> > > > > > > > >> > >
> > > > > > > > >> > > Ismael #2, Jay #1 : I was thinking that these quotas
> can
> > > > > return
> > > > > > a
> > > > > > > > >> > separate
> > > > > > > > >> > > throttle time, and all utilization based quotas could
> > use
> > > > the
> > > > > > same
> > > > > > > > >> field
> > > > > > > > >> > > (we won't add another one for network thread
> utilization
> > > for
> > > > > > > > >> instance).
> > > > > > > > >> > But
> > > > > > > > >> > > perhaps it makes sense to keep byte rate quotas
> separate
> > > in
> > > > > > > > >> produce/fetch
> > > > > > > > >> > > responses to provide separate metrics? Agree with
> Ismael
> > > > that
> > > > > > the
> > > > > > > > >> name of
> > > > > > > > >> > > the existing field should be changed if we have two.
> > Happy
> > > > to
> > > > > > > switch
> > > > > > > > >> to a
> > > > > > > > >> > > single combined throttle time if that is sufficient.
> > > > > > > > >> > >
> > > > > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will use dot
> > separated
> > > > > name
> > > > > > > for
> > > > > > > > >> new
> > > > > > > > >> > > property. Replication quotas use dot separated, so it
> > will
> > > > be
> > > > > > > > >> consistent
> > > > > > > > >> > > with all properties except byte rate quotas.
> > > > > > > > >> > >
> > > > > > > > >> > > Radai: #1 Request processing time rather than request rate was
> > > > > > > > >> > > chosen because the time per request can vary significantly
> > > > > > > > >> > > between requests, as mentioned in the discussion and the KIP.
> > > > > > > > >> > > #2 Two separate quotas for heartbeats/regular requests
> > > feel
> > > > > like
> > > > > > > > more
> > > > > > > > >> > > configuration and more metrics. Since most users would
> > set
> > > > > > quotas
> > > > > > > > >> higher
> > > > > > > > >> > > than the expected usage and quotas are more of a
> safety
> > > > net, a
> > > > > > > > single
> > > > > > > > >> > quota
> > > > > > > > >> > > should work in most cases.
> > > > > > > > >> > >  #3 The number of requests in purgatory is limited by
> > the
> > > > > number
> > > > > > > of
> > > > > > > > >> > active
> > > > > > > > >> > > connections since only one request per connection will
> > be
> > > > > > > throttled
> > > > > > > > >> at a
> > > > > > > > >> > > time.
> > > > > > > > >> > > #4 As with byte rate quotas, to use the full allocated
> > > > quotas,
> > > > > > > > >> > > clients/users would need to use partitions that are
> > > > > distributed
> > > > > > > > across
> > > > > > > > >> > the
> > > > > > > > >> > > cluster. The alternative of using cluster-wide quotas
> > > > instead
> > > > > of
> > > > > > > > >> > per-broker
> > > > > > > > >> > > quotas would be far too complex to implement.
> > > > > > > > >> > >
> > > > > > > > >> > > Dong : We currently have two ClientQuotaManagers for
> > quota
> > > > > types
> > > > > > > > Fetch
> > > > > > > > >> > and
> > > > > > > > >> > > Produce. A new one will be added for IOThread, which
> > > manages
> > > > > > > quotas
> > > > > > > > >> for
> > > > > > > > >> > I/O
> > > > > > > > >> > > thread utilization. This will not update the Fetch or
> > > > Produce
> > > > > > > > >> queue-size,
> > > > > > > > >> > > but will have a separate metric for the queue-size.  I
> > > > wasn't
> > > > > > > > >> planning to
> > > > > > > > >> > > add any additional metrics apart from the equivalent
> > ones
> > > > for
> > > > > > > > existing
> > > > > > > > >> > > quotas as part of this KIP. Ratio of byte-rate to I/O
> > > thread
> > > > > > > > >> utilization
> > > > > > > > >> > > could be slightly misleading since it depends on the
> > > > sequence
> > > > > of
> > > > > > > > >> > requests.
> > > > > > > > >> > > But we can look into more metrics after the KIP is
> > > > implemented
> > > > > > if
> > > > > > > > >> > required.
> > > > > > > > >> > >
> > > > > > > > >> > > I think we need to limit the maximum delay since all
> > > > requests
> > > > > > are
> > > > > > > > >> > > throttled. If a client has a quota of 0.001 units and
> a
> > > > single
> > > > > > > > request
> > > > > > > > >> > used
> > > > > > > > >> > > 50ms, we don't want to delay all requests from the
> > client
> > > by
> > > > > 50
> > > > > > > > >> seconds,
> > > > > > > > >> > > throwing the client out of all its consumer groups.
> The
> > > > issue
> > > > > is
> > > > > > > > only
> > > > > > > > >> if
> > > > > > > > >> > a
> > > > > > > > >> > > user is allocated a quota that is insufficient to
> > process
> > > > one
> > > > > > > large
> > > > > > > > >> > > request. The expectation is that the units allocated
> per
> > > > user
> > > > > > will
> > > > > > > > be
> > > > > > > > >> > much
> > > > > > > > >> > > higher than the time taken to process one request and
> > the
> > > > > limit
> > > > > > > > should
> > > > > > > > >> > > seldom be applied. Agree this needs proper
> > documentation.
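> > > > > > > > >> > >
> > > > > > > > >> > > To make the arithmetic concrete, here is a sketch (my
> > > > > > > > >> > > illustration only, not the actual implementation) of how the
> > > > > > > > >> > > delay could be computed and capped:
> > > > > > > > >> > >
> > > > > > > > >> > > object ThrottleDelaySketch {
> > > > > > > > >> > >   // A client that used `usedMs` of i/o thread time against a
> > > > > > > > >> > >   // quota of `quotaUnits` thread-units owes usedMs / quotaUnits
> > > > > > > > >> > >   // milliseconds; the delay is capped at the quota window.
> > > > > > > > >> > >   def delayMs(usedMs: Double, quotaUnits: Double, windowMs: Long): Long =
> > > > > > > > >> > >     math.min((usedMs / quotaUnits).toLong, windowMs)
> > > > > > > > >> > >
> > > > > > > > >> > >   def main(args: Array[String]): Unit =
> > > > > > > > >> > >     // 50ms at 0.001 units implies a 50 second delay; a 1 second
> > > > > > > > >> > >     // window caps it at 1 second.
> > > > > > > > >> > >     println(delayMs(50.0, 0.001, 1000))  // 1000
> > > > > > > > >> > > }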
> > > > > > > > >> > >
> > > > > > > > >> > > Regards,
> > > > > > > > >> > >
> > > > > > > > >> > > Rajini
> > > > > > > > >> > >
> > > > > > > > >> > >
> > > > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > > > > radai.rosenblatt@gmail.com>
> > > > > > > > >> > wrote:
> > > > > > > > >> > >
> > > > > > > > >> > >> @jun: i wasn't concerned about tying up a request processing
> > > > > > > > >> > >> thread, but IIUC the code does still read the entire request
> > > > > > > > >> > >> out, which might add up to a non-negligible amount of memory.
> > > > > > > > >> > >>
> > > > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > > > > > lindong28@gmail.com>
> > > > > > > > >> wrote:
> > > > > > > > >> > >>
> > > > > > > > >> > >> > Hey Rajini,
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > The current KIP says that the maximum delay will be
> > > > reduced
> > > > > > to
> > > > > > > > >> window
> > > > > > > > >> > >> size
> > > > > > > > >> > >> > if it is larger than the window size. I have a
> > concern
> > > > with
> > > > > > > this:
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > 1) This essentially means that the user is allowed
> to
> > > > > exceed
> > > > > > > > their
> > > > > > > > >> > quota
> > > > > > > > >> > >> > over a long period of time. Can you provide an
> upper
> > > > bound
> > > > > on
> > > > > > > > this
> > > > > > > > >> > >> > deviation?
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > 2) What is the motivation for capping the maximum delay by
> > > > > > > > >> > >> > the window size? I am wondering if there is a better
> > > > > > > > >> > >> > alternative to address the problem.
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > 3) It means that the existing metric-related config will
> > > > > > > > >> > >> > have a more direct impact on the mechanism of this
> > > > > > > > >> > >> > io-thread-unit-based quota. This may be an important change
> > > > > > > > >> > >> > depending on the answer to 1) above. We probably need to
> > > > > > > > >> > >> > document this more explicitly.
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > Dong
> > > > > > > > >> > >> >
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > > > > > lindong28@gmail.com>
> > > > > > > > >> > wrote:
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > > Hey Jun,
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > Yeah you are right. I thought it wasn't, because at
> > > > > > > > >> > >> > > LinkedIn it would be too much pressure on inGraph to expose
> > > > > > > > >> > >> > > those per-clientId metrics, so we ended up printing them
> > > > > > > > >> > >> > > periodically to a local log. Never mind if it is not a
> > > > > > > > >> > >> > > general problem.
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > Hey Rajini,
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > - I agree with Jay that we probably don't want to
> > > add a
> > > > > new
> > > > > > > > field
> > > > > > > > >> > for
> > > > > > > > >> > >> > > every quota ProduceResponse or FetchResponse. Is
> > > there
> > > > > any
> > > > > > > > >> use-case
> > > > > > > > >> > >> for
> > > > > > > > >> > >> > > having separate throttle-time fields for
> > > > byte-rate-quota
> > > > > > and
> > > > > > > > >> > >> > > io-thread-unit-quota? You probably need to document this
> > > > > > > > >> > >> > > as an interface change if you plan to add a new field in
> > > > > > > > >> > >> > > any request.
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > - I don't think IOThread belongs to quotaType.
> The
> > > > > existing
> > > > > > > > quota
> > > > > > > > >> > >> types
> > > > > > > > >> > >> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> > > > > > > > >> > >> > > identify the type of requests that are throttled, not the
> > > > > > > > >> > >> > > quota
> > > > > mechanism
> > > > > > > > that
> > > > > > > > >> is
> > > > > > > > >> > >> > applied.
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > - If a request is throttled due to this
> > > > > > io-thread-unit-based
> > > > > > > > >> quota,
> > > > > > > > >> > is
> > > > > > > > >> > >> > the
> > > > > > > > >> > >> > > existing queue-size metric in ClientQuotaManager
> > > > > > incremented?
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > - In the interest of providing guide line for
> admin
> > > to
> > > > > > decide
> > > > > > > > >> > >> > > io-thread-unit-based quota and for user to
> > understand
> > > > its
> > > > > > > > impact
> > > > > > > > >> on
> > > > > > > > >> > >> their
> > > > > > > > >> > >> > > traffic, would it be useful to have a metric that
> > > shows
> > > > > the
> > > > > > > > >> overall
> > > > > > > > >> > >> > > byte-rate per io-thread-unit? Can we also show
> > this a
> > > > > > > > >> per-clientId
> > > > > > > > >> > >> > metric?
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > Thanks,
> > > > > > > > >> > >> > > Dong
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <
> > > > > jun@confluent.io
> > > > > > >
> > > > > > > > >> wrote:
> > > > > > > > >> > >> > >
> > > > > > > > >> > >> > >> Hi, Ismael,
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> For #3, typically, an admin won't configure more
> > io
> > > > > > threads
> > > > > > > > than
> > > > > > > > >> > CPU
> > > > > > > > >> > >> > >> cores,
> > > > > > > > >> > >> > >> but it's possible for an admin to start with
> fewer
> > > io
> > > > > > > threads
> > > > > > > > >> than
> > > > > > > > >> > >> cores
> > > > > > > > >> > >> > >> and grow that later on.
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> Hi, Dong,
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> I think the throttleTime sensor on the broker
> > tells
> > > > the
> > > > > > > admin
> > > > > > > > >> > >> whether a
> > > > > > > > >> > >> > >> user/clentId is throttled or not.
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> Hi, Radi,
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> The reasoning for delaying the throttled
> requests
> > on
> > > > the
> > > > > > > > broker
> > > > > > > > >> > >> instead
> > > > > > > > >> > >> > of
> > > > > > > > >> > >> > >> returning an error immediately is that the
> latter
> > > has
> > > > no
> > > > > > way
> > > > > > > > to
> > > > > > > > >> > >> prevent
> > > > > > > > >> > >> > >> the
> > > > > > > > >> > >> > >> client from retrying immediately, which will
> make
> > > > things
> > > > > > > > worse.
> > > > > > > > >> The
> > > > > > > > >> > >> > >> delaying logic is based off a delay queue. A
> > > separate
> > > > > > > > expiration
> > > > > > > > >> > >> thread
> > > > > > > > >> > >> > >> just waits on the next to be expired request.
> So,
> > it
> > > > > > doesn't
> > > > > > > > tie
> > > > > > > > >> > up a
> > > > > > > > >> > >> > >> request handler thread.
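> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> In code form, the idea is roughly the following (a sketch
> > > > > > > > >> > >> > >> of the mechanism only, not the actual broker code):
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> class ThrottledResponse(val send: () => Unit, delayMs: Long)
> > > > > > > > >> > >> > >>     extends Delayed {
> > > > > > > > >> > >> > >>   private val dueNs = System.nanoTime + TimeUnit.MILLISECONDS.toNanos(delayMs)
> > > > > > > > >> > >> > >>   override def getDelay(unit: TimeUnit): Long =
> > > > > > > > >> > >> > >>     unit.convert(dueNs - System.nanoTime, TimeUnit.NANOSECONDS)
> > > > > > > > >> > >> > >>   override def compareTo(o: Delayed): Int =
> > > > > > > > >> > >> > >>     java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS),
> > > > > > > > >> > >> > >>                            o.getDelay(TimeUnit.NANOSECONDS))
> > > > > > > > >> > >> > >> }
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> object ThrottleExpiration {
> > > > > > > > >> > >> > >>   val queue = new DelayQueue[ThrottledResponse]()
> > > > > > > > >> > >> > >>   // One thread blocks on take() until the next response is
> > > > > > > > >> > >> > >>   // due, so no request handler thread is held up.
> > > > > > > > >> > >> > >>   val expirer = new Thread(() => { while (true) queue.take().send() },
> > > > > > > > >> > >> > >>                            "throttle-expiration")
> > > > > > > > >> > >> > >> }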
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> Thanks,
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> Jun
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <
> > > > > > > > ismael@juma.me.uk
> > > > > > > > >> >
> > > > > > > > >> > >> wrote:
> > > > > > > > >> > >> > >>
> > > > > > > > >> > >> > >> > Hi Jay,
> > > > > > > > >> > >> > >> >
> > > > > > > > >> > >> > >> > Regarding 1, I definitely like the simplicity
> of
> > > > > > keeping a
> > > > > > > > >> single
> > > > > > > > >> > >> > >> throttle
> > > > > > > > >> > >> > >> > time field in the response. The downside is
> that
> > > the
> > > > > > > client
> > > > > > > > >> > metrics
> > > > > > > > >> > >> > >> will be
> > > > > > > > >> > >> > >> > more coarse grained.
> > > > > > > > >> > >> > >> >
> > > > > > > > >> > >> > >> > Regarding 3, we have
> > > > > > > > >> > >> > >> > `leader.imbalance.per.broker.percentage` and
> > > > > > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > > > > > >> > >> > >> >
> > > > > > > > >> > >> > >> > Ismael
> > > > > > > > >> > >> > >> >
> > > > > > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <
> > > > > > > > jay@confluent.io>
> > > > > > > > >> > >> wrote:
> > > > > > > > >> > >> > >> >
> > > > > > > > >> > >> > >> > > A few minor comments:
> > > > > > > > >> > >> > >> > >
> > > > > > > > >> > >> > >> > >    1. Isn't it the case that the throttling
> > time
> > > > > > > response
> > > > > > > > >> field
> > > > > > > > >> > >> > should
> > > > > > > > >> > >> > >> > have
> > > > > > > > >> > >> > >> > >    the total time your request was throttled
> > > > > > > irrespective
> > > > > > > > of
> > > > > > > > >> > the
> > > > > > > > >> > >> > >> quotas
> > > > > > > > >> > >> > >> > > that
> > > > > > > > >> > >> > >> > >    caused that. Limiting it to the byte rate quota
> > > > > > > > >> > >> > >> > >    doesn't make sense, but I also don't think we want
> > > > > > > > >> > >> > >> > >    to end up adding new fields in the response for
> > > > > > > > >> > >> > >> > >    every single thing we quota, right?
> > > > > > > > >> > >> > >> > >    2. I don't think we should make this
> quota
> > > > > > > specifically
> > > > > > > > >> > about
> > > > > > > > >> > >> io
> > > > > > > > >> > >> > >> > >    threads. Once we introduce these quotas
> > > people
> > > > > set
> > > > > > > them
> > > > > > > > >> and
> > > > > > > > >> > >> > expect
> > > > > > > > >> > >> > >> > them
> > > > > > > > >> > >> > >> > > to
> > > > > > > > >> > >> > >> > >    be enforced (and if they aren't it may
> > cause
> > > an
> > > > > > > > outage).
> > > > > > > > >> As
> > > > > > > > >> > a
> > > > > > > > >> > >> > >> result
> > > > > > > > >> > >> > >> > > they
> > > > > > > > >> > >> > >> > >    are a bit more sensitive than normal
> > > configs, I
> > > > > > > think.
> > > > > > > > >> The
> > > > > > > > >> > >> > current
> > > > > > > > >> > >> > >> > > thread
> > > > > > > > >> > >> > >> > >    pools seem like something of an
> > > implementation
> > > > > > detail
> > > > > > > > and
> > > > > > > > >> > not
> > > > > > > > >> > >> the
> > > > > > > > >> > >> > >> > level
> > > > > > > > >> > >> > >> > > the
> > > > > > > > >> > >> > >> > >    user-facing quotas should be involved
> > with. I
> > > > > think
> > > > > > > it
> > > > > > > > >> might
> > > > > > > > >> > >> be
> > > > > > > > >> > >> > >> better
> > > > > > > > >> > >> > >> > > to
> > > > > > > > >> > >> > >> > >    make this a general request-time throttle
> > > with
> > > > no
> > > > > > > > >> mention in
> > > > > > > > >> > >> the
> > > > > > > > >> > >> > >> > naming
> > > > > > > > >> > >> > >> > >    about I/O threads and simply acknowledge
> > the
> > > > > > current
> > > > > > > > >> > >> limitation
> > > > > > > > >> > >> > >> (which
> > > > > > > > >> > >> > >> > > we
> > > > > > > > >> > >> > >> > >    may someday fix) in the docs that this
> > covers
> > > > > only
> > > > > > > the
> > > > > > > > >> time
> > > > > > > > >> > >> after
> > > > > > > > >> > >> > >> the
> > > > > > > > >> > >> > >> > >    thread is read off the network.
> > > > > > > > >> > >> > >> > >    3. As such I think the right interface to
> > the
> > > > > user
> > > > > > > > would
> > > > > > > > >> be
> > > > > > > > >> > >> > >> something
> > > > > > > > >> > >> > >> > >    like percent_request_time and be in
> > > {0,...100}
> > > > or
> > > > > > > > >> > >> > >> request_time_ratio
> > > > > > > > >> > >> > >> > > and be
> > > > > > > > >> > >> > >> > >    in {0.0,...,1.0} (I think "ratio" is the
> > > > > > terminology
> > > > > > > we
> > > > > > > > >> used
> > > > > > > > >> > >> if
> > > > > > > > >> > >> > the
> > > > > > > > >> > >> > >> > > scale
> > > > > > > > >> > >> > >> > >    is between 0 and 1 in the other metrics,
> > > > right?)
> > > > > > > > >> > >> > >> > >
> > > > > > > > >> > >> > >> > > -Jay
> > > > > > > > >> > >> > >> > >
> > > > > > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini
> > Sivaram
> > > <
> > > > > > > > >> > >> > >> rajinisivaram@gmail.com
> > > > > > > > >> > >> > >> > >
> > > > > > > > >> > >> > >> > > wrote:
> > > > > > > > >> > >> > >> > >
> > > > > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > Thank you for the feedback.
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > Guozhang : I have updated the section on
> > > > > > co-existence
> > > > > > > of
> > > > > > > > >> byte
> > > > > > > > >> > >> rate
> > > > > > > > >> > >> > >> and
> > > > > > > > >> > >> > >> > > > request time quotas.
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > Dong: I hadn't added much detail to the
> > > metrics
> > > > > and
> > > > > > > > >> sensors
> > > > > > > > >> > >> since
> > > > > > > > >> > >> > >> they
> > > > > > > > >> > >> > >> > > are
> > > > > > > > >> > >> > >> > > > going to be very similar to the existing
> > > metrics
> > > > > and
> > > > > > > > >> sensors.
> > > > > > > > >> > >> To
> > > > > > > > >> > >> > >> avoid
> > > > > > > > >> > >> > >> > > > confusion, I have now added more detail.
> All
> > > > > metrics
> > > > > > > are
> > > > > > > > >> in
> > > > > > > > >> > the
> > > > > > > > >> > >> > >> group
> > > > > > > > >> > >> > >> > > > "quotaType" and all sensors have names
> > > starting
> > > > > with
> > > > > > > > >> > >> "quotaType"
> > > > > > > > >> > >> > >> (where
> > > > > > > > >> > >> > >> > > > quotaType is
> > > > > > > > >> > >> > >> > > > Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*).
> > > > > > > > >> > >> > >> > > > So there will be no reuse of existing
> > > > > > metrics/sensors.
> > > > > > > > The
> > > > > > > > >> > new
> > > > > > > > >> > >> > ones
> > > > > > > > >> > >> > >> for
> > > > > > > > >> > >> > >> > > > request processing time based throttling
> > will
> > > be
> > > > > > > > >> completely
> > > > > > > > >> > >> > >> independent
> > > > > > > > >> > >> > >> > > of
> > > > > > > > >> > >> > >> > > > existing metrics/sensors, but will be
> > > consistent
> > > > > in
> > > > > > > > >> format.
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > The existing throttle_time_ms field in
> > > > > produce/fetch
> > > > > > > > >> > responses
> > > > > > > > >> > >> > will
> > > > > > > > >> > >> > >> not
> > > > > > > > >> > >> > >> > > be
> > > > > > > > >> > >> > >> > > > impacted by this KIP. That will continue
> to
> > > > return
> > > > > > > > >> byte-rate
> > > > > > > > >> > >> based
> > > > > > > > >> > >> > >> > > > throttling times. In addition, a new field
> > > > > > > > >> > >> > request_throttle_time_ms
> > > > > > > > >> > >> > >> > will
> > > > > > > > >> > >> > >> > > be
> > > > > > > > >> > >> > >> > > > added to return request quota based
> > throttling
> > > > > > times.
> > > > > > > > >> These
> > > > > > > > >> > >> will
> > > > > > > > >> > >> > be
> > > > > > > > >> > >> > >> > > exposed
> > > > > > > > >> > >> > >> > > > as new metrics on the client-side.
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > Since all metrics and sensors are
> different
> > > for
> > > > > each
> > > > > > > > type
> > > > > > > > >> of
> > > > > > > > >> > >> > quota,
> > > > > > > > >> > >> > >> I
> > > > > > > > >> > >> > >> > > > believe there is already sufficient
> metrics
> > to
> > > > > > monitor
> > > > > > > > >> > >> throttling
> > > > > > > > >> > >> > on
> > > > > > > > >> > >> > >> > both
> > > > > > > > >> > >> > >> > > > client and broker side for each type of
> > > > > throttling.
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > Regards,
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > Rajini
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin
> <
> > > > > > > > >> > lindong28@gmail.com
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > >> wrote:
> > > > > > > > >> > >> > >> > > >
> > > > > > > > >> > >> > >> > > > > Hey Rajini,
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > I think it makes a lot of sense to use
> > > > > > > io_thread_units
> > > > > > > > >> as
> > > > > > > > >> > >> metric
> > > > > > > > >> > >> > >> to
> > > > > > > > >> > >> > >> > > quota
> > > > > > > > >> > >> > >> > > > > user's traffic here. LGTM overall. I
> have
> > > some
> > > > > > > > questions
> > > > > > > > >> > >> > regarding
> > > > > > > > >> > >> > >> > > > sensors.
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > - Can you be more specific in the KIP
> what
> > > > > sensors
> > > > > > > > will
> > > > > > > > >> be
> > > > > > > > >> > >> > added?
> > > > > > > > >> > >> > >> For
> > > > > > > > >> > >> > >> > > > > example, it will be useful to specify
> the
> > > name
> > > > > and
> > > > > > > > >> > >> attributes of
> > > > > > > > >> > >> > >> > these
> > > > > > > > >> > >> > >> > > > new
> > > > > > > > >> > >> > >> > > > > sensors.
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > - We currently have throttle-time and
> > > > queue-size
> > > > > > for
> > > > > > > > >> > >> byte-rate
> > > > > > > > >> > >> > >> based
> > > > > > > > >> > >> > >> > > > quota.
> > > > > > > > >> > >> > >> > > > > Are you going to have separate
> > throttle-time
> > > > and
> > > > > > > > >> queue-size
> > > > > > > > >> > >> for
> > > > > > > > >> > >> > >> > > requests
> > > > > > > > >> > >> > >> > > > > throttled by io_thread_unit-based quota,
> > or
> > > > will
> > > > > > > they
> > > > > > > > >> share
> > > > > > > > >> > >> the
> > > > > > > > >> > >> > >> same
> > > > > > > > >> > >> > >> > > > > sensor?
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > - Does the throttle-time in the
> > > > ProduceResponse
> > > > > > and
> > > > > > > > >> > >> > FetchResponse
> > > > > > > > >> > >> > >> > > > contains
> > > > > > > > >> > >> > >> > > > > time due to io_thread_unit-based quota?
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > - Currently the kafka server doesn't provide any
> > > > > > > > >> > >> > >> > > > > log or metrics that tell
> > > > > > > > >> > >> > >> > > > > whether any given clientId (or user) is
> > > > > throttled.
> > > > > > > > This
> > > > > > > > >> is
> > > > > > > > >> > >> not
> > > > > > > > >> > >> > too
> > > > > > > > >> > >> > >> > bad
> > > > > > > > >> > >> > >> > > > > because we can still check the
> client-side
> > > > > > byte-rate
> > > > > > > > >> metric
> > > > > > > > >> > >> to
> > > > > > > > >> > >> > >> > validate
> > > > > > > > >> > >> > >> > > > > whether a given client is throttled. But
> > > with
> > > > > this
> > > > > > > > >> > >> > io_thread_unit,
> > > > > > > > >> > >> > >> > > there
> > > > > > > > >> > >> > >> > > > > will be no way to validate whether a
> given
> > > > > client
> > > > > > is
> > > > > > > > >> slow
> > > > > > > > >> > >> > because
> > > > > > > > >> > >> > >> it
> > > > > > > > >> > >> > >> > > has
> > > > > > > > >> > >> > >> > > > > exceeded its io_thread_unit limit. It is
> > > > > necessary
> > > > > > > for
> > > > > > > > >> user
> > > > > > > > >> > >> to
> > > > > > > > >> > >> > be
> > > > > > > > >> > >> > >> > able
> > > > > > > > >> > >> > >> > > to
> > > > > > > > >> > >> > >> > > > > know this information to figure out whether they
> > > > > > > > >> > >> > >> > > > > have reached their quota limit. How about we add a
> > > > > > > > >> > >> > >> > > > > log4j log on the server side to periodically print
> > > > > > > > >> > >> > >> > > > > the (client_id, byte-rate-throttle-time,
> > > > > > > > >> > >> > >> > io-thread-unit-throttle-time)
> > > > > > > > >> > >> > >> > > so
> > > > > > > > >> > >> > >> > > > > that the kafka administrator can figure out those
> > > > > > > > >> > >> > >> > > > > users that have reached their limit and act
> > > > > > > > >> > >> > >> > > > > accordingly?
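> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > Something like this sketch (all names made up, just
> > > > > > > > >> > >> > >> > > > > to show the idea):
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > import java.util.concurrent.{Executors, TimeUnit}
> > > > > > > > >> > >> > >> > > > > import scala.collection.concurrent.TrieMap
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > object ThrottleLogger {
> > > > > > > > >> > >> > >> > > > >   // clientId -> (byteRateThrottleMs, ioThreadThrottleMs),
> > > > > > > > >> > >> > >> > > > >   // updated by the quota code as requests are throttled.
> > > > > > > > >> > >> > >> > > > >   val throttleTimes = TrieMap.empty[String, (Long, Long)]
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > >   def start(): Unit =
> > > > > > > > >> > >> > >> > > > >     Executors.newSingleThreadScheduledExecutor().scheduleAtFixedRate(
> > > > > > > > >> > >> > >> > > > >       () => throttleTimes.foreach { case (id, (b, io)) =>
> > > > > > > > >> > >> > >> > > > >         if (b > 0 || io > 0)  // a real broker would use log4j here
> > > > > > > > >> > >> > >> > > > >           println(s"throttled client=$id byteRateMs=$b ioThreadMs=$io")
> > > > > > > > >> > >> > >> > > > >       }, 1, 1, TimeUnit.MINUTES)
> > > > > > > > >> > >> > >> > > > > }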
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > Thanks,
> > > > > > > > >> > >> > >> > > > > Dong
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM,
> Guozhang
> > > > Wang <
> > > > > > > > >> > >> > >> wangguoz@gmail.com>
> > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > >> > >> > >> > > > >
> > > > > > > > >> > >> > >> > > > > > Made a pass over the doc, overall LGTM
> > > > except
> > > > > a
> > > > > > > > minor
> > > > > > > > >> > >> comment
> > > > > > > > >> > >> > on
> > > > > > > > >> > >> > >> > the
> > > > > > > > >> > >> > >> > > > > > throttling implementation:
> > > > > > > > >> > >> > >> > > > > >
> > > > > > > > >> > >> > >> > > > > > Stated as "Request processing time
> > > > throttling
> > > > > > will
> > > > > > > > be
> > > > > > > > >> > >> applied
> > > > > > > > >> > >> > on
> > > > > > > > >> > >> > >> > top
> > > > > > > > >> > >> > >> > > if
> > > > > > > > >> > >> > >> > > > > > necessary." I thought that it meant
> the
> > > > > request
> > > > > > > > >> > processing
> > > > > > > > >> > >> > time
> > > > > > > > >> > >> > >> > > > > throttling
> > > > > > > > >> > >> > >> > > > > > is applied first, but continue
> reading I
> > > > found
> > > > > > it
> > > > > > > > >> > actually
> > > > > > > > >> > >> > >> meant to
> > > > > > > > >> > >> > >> > > > apply
> > > > > > > > >> > >> > >> > > > > > produce / fetch byte rate throttling
> > > first.
> > > > > > > > >> > >> > >> > > > > >
> > > > > > > > >> > >> > >> > > > > > Also the last sentence "The remaining
> > > delay
> > > > if
> > > > > > any
> > > > > > > > is
> > > > > > > > >> > >> applied
> > > > > > > > >> > >> > to
> > > > > > > > >> > >> > >> > the
> > > > > > > > >> > >> > >> > > > > > response." is a bit confusing to me.
> > Maybe
> > > > > > > rewording
> > > > > > > > >> it a
> > > > > > > > >> > >> bit?
> > > > > > > > >> > >> > >> > > > > >
> > > > > > > > >> > >> > >> > > > > >
> > > > > > > > >> > >> > >> > > > > > Guozhang
> > > > > > > > >> > >> > >> > > > > >
> > > > > > > > >> > >> > >> > > > > >
> > > > > > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun
> > Rao <
> > > > > > > > >> > jun@confluent.io
> > > > > > > > >> > >> >
> > > > > > > > >> > >> > >> wrote:
> > > > > > > > >> > >> > >> > > > > >
> > > > > > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > >> > >> > >> > > > > > > Thanks for the updated KIP. The
> latest
> > > > > > proposal
> > > > > > > > >> looks
> > > > > > > > >> > >> good
> > > > > > > > >> > >> > to
> > > > > > > > >> > >> > >> me.
> > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > >> > >> > >> > > > > > > Jun
> > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM,
> > Rajini
> > > > > > Sivaram
> > > > > > > <
> > > > > > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > > > >> > >> > >> > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > Thank you for the feedback.
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > 1. I have updated the KIP to use
> > > > absolute
> > > > > > > units
> > > > > > > > >> > >> instead of
> > > > > > > > >> > >> > >> > > > > percentage.
> > > > > > > > >> > >> > >> > > > > > > The
> > > > > > > > >> > >> > >> > > > > > > > property is called *io_thread_units* to
> > > > > > > > >> > >> > >> > > > > > > > align with the thread count
> > > > > > > > >> > >> > >> > > > > > > > property *num.io.threads*. When we
> > > > > implement
> > > > > > > > >> network
> > > > > > > > >> > >> > thread
> > > > > > > > >> > >> > >> > > > > utilization
> > > > > > > > >> > >> > >> > > > > > > > quotas, we can add another
> property
> > > > > > > > >> > >> > *network_thread_units.*
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > 2. ControlledShutdown is already
> > > listed
> > > > > > under
> > > > > > > > the
> > > > > > > > >> > >> exempt
> > > > > > > > >> > >> > >> > > requests.
> > > > > > > > >> > >> > >> > > > > Jun,
> > > > > > > > >> > >> > >> > > > > > > did
> > > > > > > > >> > >> > >> > > > > > > > you mean a different request that
> > > needs
> > > > to
> > > > > > be
> > > > > > > > >> added?
> > > > > > > > >> > >> The
> > > > > > > > >> > >> > >> four
> > > > > > > > >> > >> > >> > > > > requests
> > > > > > > > >> > >> > >> > > > > > > > currently exempt in the KIP are
> > > > > StopReplica,
> > > > > > > > >> > >> > >> > ControlledShutdown,
> > > > > > > > >> > >> > >> > > > > > > > LeaderAndIsr and UpdateMetadata.
> > These
> > > > are
> > > > > > > > >> controlled
> > > > > > > > >> > >> > using
> > > > > > > > >> > >> > >> > > > > > ClusterAction
> > > > > > > > >> > >> > >> > > > > > > > ACL, so it is easy to exclude and
> > only
> > > > > > > throttle
> > > > > > > > if
> > > > > > > > >> > >> > >> > unauthorized.
> > > > > > > > >> > >> > >> > > I
> > > > > > > > >> > >> > >> > > > > > wasn't
> > > > > > > > >> > >> > >> > > > > > > > sure if there are other requests
> > used
> > > > only
> > > > > > for
> > > > > > > > >> > >> > inter-broker
> > > > > > > > >> > >> > >> > that
> > > > > > > > >> > >> > >> > > > > needed
> > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > >> > >> > >> > > > > > > > be excluded.
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > 3. I was thinking the smallest
> > change
> > > > > would
> > > > > > be
> > > > > > > > to
> > > > > > > > >> > >> replace
> > > > > > > > >> > >> > >> all
> > > > > > > > >> > >> > >> > > > > > references
> > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > >> > >> > >> > > > > > > > *requestChannel.sendResponse()*
> > with
> > > a
> > > > > > local
> > > > > > > > >> method
> > > > > > > > >> > >> > >> > > > > > > > *sendResponseMaybeThrottle()* that
> > > does
> > > > > the
> > > > > > > > >> > throttling
> > > > > > > > >> > >> if
> > > > > > > > >> > >> > >> any
> > > > > > > > >> > >> > >> > > plus
> > > > > > > > >> > >> > >> > > > > send
> > > > > > > > >> > >> > >> > > > > > > > response. If we throttle first in
> > > > > > > > >> > *KafkaApis.handle()*,
> > > > > > > > >> > >> > the
> > > > > > > > >> > >> > >> > time
> > > > > > > > >> > >> > >> > > > > spent
> > > > > > > > >> > >> > >> > > > > > > > within the method handling the
> > request
> > > > > will
> > > > > > > not
> > > > > > > > be
> > > > > > > > >> > >> > recorded
> > > > > > > > >> > >> > >> or
> > > > > > > > >> > >> > >> > > used
> > > > > > > > >> > >> > >> > > > > in
> > > > > > > > >> > >> > >> > > > > > > > throttling. We can look into this
> > > again
> > > > > when
> > > > > > > the
> > > > > > > > >> PR
> > > > > > > > >> > is
> > > > > > > > >> > >> > ready
> > > > > > > > >> > >> > >> > for
> > > > > > > > >> > >> > >> > > > > > review.
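> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > Roughly (a sketch of the shape only; all the
> > > > > > > > >> > >> > >> > > > > > > > names here are made up):
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > trait Response
> > > > > > > > >> > >> > >> > > > > > > > trait Quotas {
> > > > > > > > >> > >> > >> > > > > > > >   // Records handler time and returns the delay to apply, if any.
> > > > > > > > >> > >> > >> > > > > > > >   def recordAndGetThrottleMs(user: String, timeMs: Long): Long
> > > > > > > > >> > >> > >> > > > > > > > }
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > class HandlerSketch(quotas: Quotas,
> > > > > > > > >> > >> > >> > > > > > > >                     send: Response => Unit,
> > > > > > > > >> > >> > >> > > > > > > >                     park: (Response, Long) => Unit) {
> > > > > > > > >> > >> > >> > > > > > > >   def sendResponseMaybeThrottle(user: String, handlerTimeMs: Long,
> > > > > > > > >> > >> > >> > > > > > > >                                 response: Response): Unit = {
> > > > > > > > >> > >> > >> > > > > > > >     val throttleMs = quotas.recordAndGetThrottleMs(user, handlerTimeMs)
> > > > > > > > >> > >> > >> > > > > > > >     if (throttleMs > 0) park(response, throttleMs) // via the delay queue
> > > > > > > > >> > >> > >> > > > > > > >     else send(response)
> > > > > > > > >> > >> > >> > > > > > > >   }
> > > > > > > > >> > >> > >> > > > > > > > }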
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > Regards,
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > Rajini
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM,
> > Roger
> > > > > > Hoover
> > > > > > > <
> > > > > > > > >> > >> > >> > > > > roger.hoover@gmail.com>
> > > > > > > > >> > >> > >> > > > > > > > wrote:
> > > > > > > > >> > >> > >> > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > Great to see this KIP and the
> > > > excellent
> > > > > > > > >> discussion.
> > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > To me, Jun's suggestion makes
> > sense.
> > > > If
> > > > > > my
> > > > > > > > >> > >> application
> > > > > > > > >> > >> > is
> > > > > > > > >> > >> > >> > > > > allocated
> > > > > > > > >> > >> > >> > > > > > 1
> > > > > > > > >> > >> > >> > > > > > > > > request handler unit, then it's
> as
> > > if
> > > > I
> > > > > > > have a
> > > > > > > > >> > Kafka
> > > > > > > > >> > >> > >> broker
> > > > > > > > >> > >> > >> > > with
> > > > > > > > >> > >> > >> > > > a
> > > > > > > > >> > >> > >> > > > > > > single
> > > > > > > > >> > >> > >> > > > > > > > > request handler thread dedicated
> > to
> > > > me.
> > > > > > > > That's
> > > > > > > > >> the
> > > > > > > > >> > >> > most I
> > > > > > > > >> > >> > >> > can
> > > > > > > > >> > >> > >> > > > use,
> > > > > > > > >> > >> > >> > > > > > at
> > > > > > > > >> > >> > >> > > > > > > > > least.  That allocation doesn't
> > > change
> > > > > > even
> > > > > > > if
> > > > > > > > >> an
> > > > > > > > >> > >> admin
> > > > > > > > >> > >> > >> later
> > > > > > > > >> > >> > >> > > > > > increases
> > > > > > > > >> > >> > >> > > > > > > > the
> > > > > > > > >> > >> > >> > > > > > > > > size of the request thread pool
> on
> > > the
> > > > > > > broker.
> > > > > > > > >> > It's
> > > > > > > > >> > >> > >> similar
> > > > > > > > >> > >> > >> > to
> > > > > > > > >> > >> > >> > > > the
> > > > > > > > >> > >> > >> > > > > > CPU
> > > > > > > > >> > >> > >> > > > > > > > > abstraction that VMs and
> > containers
> > > > get
> > > > > > from
> > > > > > > > >> > >> hypervisors
> > > > > > > > >> > >> > >> or
> > > > > > > > >> > >> > >> > OS
> > > > > > > > >> > >> > >> > > > > > > > schedulers.
> > > > > > > > >> > >> > >> > > > > > > > > While different client access
> > > patterns
> > > > > can
> > > > > > > use
> > > > > > > > >> > wildly
> > > > > > > > >> > >> > >> > different
> > > > > > > > >> > >> > >> > > > > > amounts
> > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > >> > >> > >> > > > > > > > > request thread resources per
> > > request,
> > > > a
> > > > > > > given
> > > > > > > > >> > >> > application
> > > > > > > > >> > >> > >> > will
> > > > > > > > >> > >> > >> > > > > > > generally
> > > > > > > > >> > >> > >> > > > > > > > > have a stable access pattern and
> > can
> > > > > > figure
> > > > > > > > out
> > > > > > > > >> > >> > >> empirically
> > > > > > > > >> > >> > >> > how
> > > > > > > > >> > >> > >> > > > > many
> > > > > > > > >> > >> > >> > > > > > > > > "request thread units" it needs
> to
> > > > meet
> > > > > > it's
> > > > > > > > >> > >> > >> > throughput/latency
> > > > > > > > >> > >> > >> > > > > > goals.
> > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > Cheers,
> > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > Roger
> > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM,
> > Jun
> > > > > Rao <
> > > > > > > > >> > >> > >> jun@confluent.io>
> > > > > > > > >> > >> > >> > > > wrote:
> > > > > > > > >> > >> > >> > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > Hi, Rajini,
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > Thanks for the updated KIP. A
> > few
> > > > more
> > > > > > > > >> comments.
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > 1. A concern of
> > > request_time_percent
> > > > > is
> > > > > > > that
> > > > > > > > >> it's
> > > > > > > > >> > >> not
> > > > > > > > >> > >> > an
> > > > > > > > >> > >> > >> > > > absolute
> > > > > > > > >> > >> > >> > > > > > > > value.
> > > > > > > > >> > >> > >> > > > > > > > > > Let's say you give a user a
> 10%
> > > > limit.
> > > > > > If
> > > > > > > > the
> > > > > > > > >> > admin
> > > > > > > > >> > >> > >> doubles
> > > > > > > > >> > >> > >> > > the
> > > > > > > > >> > >> > >> > > > > > > number
> > > > > > > > >> > >> > >> > > > > > > > of
> > > > > > > > >> > >> > >> > > > > > > > > > request handler threads, that
> > user
> > > > now
> > > > > > > > >> actually
> > > > > > > > >> > has
> > > > > > > > >> > >> > >> twice
> > > > > > > > >> > >> > >> > the
> > > > > > > > >> > >> > >> > > > > > > absolute
> > > > > > > > >> > >> > >> > > > > > > > > > capacity. This may confuse
> > people
> > > a
> > > > > bit.
> > > > > > > So,
> > > > > > > > >> > >> perhaps
> > > > > > > > >> > >> > >> > setting
> > > > > > > > >> > >> > >> > > > the
> > > > > > > > >> > >> > >> > > > > > > quota
> > > > > > > > >> > >> > >> > > > > > > > > > based on an absolute request
> > > thread
> > > > > unit
> > > > > > > is
> > > > > > > > >> > better.
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest
> is
> > > also
> > > > > an
> > > > > > > > >> > >> inter-broker
> > > > > > > > >> > >> > >> > request
> > > > > > > > >> > >> > >> > > > and
> > > > > > > > >> > >> > >> > > > > > > needs
> > > > > > > > >> > >> > >> > > > > > > > to
> > > > > > > > >> > >> > >> > > > > > > > > > be excluded from throttling.
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > 3. Implementation wise, I am
> > > > wondering
> > > > > > if
> > > > > > > > it's
> > > > > > > > >> > >> simpler
> > > > > > > > >> > >> > >> to
> > > > > > > > >> > >> > >> > > apply
> > > > > > > > >> > >> > >> > > > > the
> > > > > > > > >> > >> > >> > > > > > > > > request
> > > > > > > > >> > >> > >> > > > > > > > > > time throttling first in
> > > > > > > KafkaApis.handle().
> > > > > > > > >> > >> > Otherwise,
> > > > > > > > >> > >> > >> we
> > > > > > > > >> > >> > >> > > will
> > > > > > > > >> > >> > >> > > > > > need
> > > > > > > > >> > >> > >> > > > > > > to
> > > > > > > > >> > >> > >> > > > > > > > > add
> > > > > > > > >> > >> > >> > > > > > > > > > the throttling logic in each
> > type
> > > of
> > > > > > > > request.
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > Thanks,
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > Jun
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58
> AM,
> > > > > Rajini
> > > > > > > > >> Sivaram <
> > > > > > > > >> > >> > >> > > > > > > > rajinisivaram@gmail.com
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > wrote:
> > > > > > > > >> > >> > >> > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > Jun,
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > Thank you for the review.
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > I have reverted to the
> > original
> > > > KIP
> > > > > > that
> > > > > > > > >> > >> throttles
> > > > > > > > >> > >> > >> based
> > > > > > > > >> > >> > >> > on
> > > > > > > > >> > >> > >> > > > > > request
> > > > > > > > >> > >> > >> > > > > > > > > > handler
> > > > > > > > >> > >> > >> > > > > > > > > > > utilization. At the moment,
> it
> > > > uses
> > > > > > > > >> percentage,
> > > > > > > > >> > >> but
> > > > > > > > >> > >> > I
> > > > > > > > >> > >> > >> am
> > > > > > > > >> > >> > >> > > > happy
> > > > > > > > >> > >> > >> > > > > to
> > > > > > > > >> > >> > >> > > > > > > > > change
> > > > > > > > >> > >> > >> > > > > > > > > > to
> > > > > > > > >> > >> > >> > > > > > > > > > > a fraction (out of 1 instead
> > of
> > > > 100)
> > > > > > if
> > > > > > > > >> > >> required. I
> > > > > > > > >> > >> > >> have
> > > > > > > > >> > >> > >> > > > added
> > > > > > > > >> > >> > >> > > > > > the
> > > > > > > > >> > >> > >> > > > > > > > > > examples
> > > > > > > > >> > >> > >> > > > > > > > > > > from this discussion to the
> > KIP.
> > > > > Also
> > > > > > > > added
> > > > > > > > >> a
> > > > > > > > >> > >> > "Future
> > > > > > > > >> > >> > >> > Work"
> > > > > > > > >> > >> > >> > > > > > section
> > > > > > > > >> > >> > >> > > > > > > > to
> > > > > > > > >> > >> > >> > > > > > > > > > > address network thread
> > > > utilization.
> > > > > > The
> > > > > > > > >> > >> > configuration
> > > > > > > > >> > >> > >> is
> > > > > > > > >> > >> > >> > > > named
> > > > > > > > >> > >> > >> > > > > > > > > > > "request_time_percent" with
> > the
> > > > > > > > expectation
> > > > > > > > >> > that
> > > > > > > > >> > >> it
> > > > > > > > >> > >> > >> can
> > > > > > > > >> > >> > >> > > also
> > > > > > > > >> > >> > >> > > > be
> > > > > > > > >> > >> > >> > > > > > > used
> > > > > > > > >> > >> > >> > > > > > > > as
> > > > > > > > >> > >> > >> > > > > > > > > > the
> > > > > > > > >> > >> > >> > > > > > > > > > > limit for network thread
> > > > utilization
> > > > > > > when
> > > > > > > > >> that
> > > > > > > > >> > is
> > > > > > > > >> > >> > >> > > > implemented,
> > > > > > > > >> > >> > >> > > > > so
> > > > > > > > >> > >> > >> > > > > > > > that
> > > > > > > > >> > >> > >> > > > > > > > > > > users have to set only one
> > > config
> > > > > for
> > > > > > > the
> > > > > > > > >> two
> > > > > > > > >> > and
> > > > > > > > >> > >> > not
> > > > > > > > >> > >> > >> > have
> > > > > > > > >> > >> > >> > > to
> > > > > > > > >> > >> > >> > > > > > worry
> > > > > > > > >> > >> > >> > > > > > > > > about
> > > > > > > > >> > >> > >> > > > > > > > > > > the internal distribution of
> > the
> > > > > work
> > > > > > > > >> between
> > > > > > > > >> > the
> > > > > > > > >> > >> > two
> > > > > > > > >> > >> > >> > > thread
> > > > > > > > >> > >> > >> > > > > > pools
> > > > > > > > >> > >> > >> > > > > > > in
> > > > > > > > >> > >> > >> > > > > > > > > > > Kafka.
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > Regards,
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > Rajini
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at
> 12:23
> > > AM,
> > > > > Jun
> > > > > > > Rao
> > > > > > > > <
> > > > > > > > >> > >> > >> > > jun@confluent.io>
> > > > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > > Hi, Rajini,
> > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > > Thanks for the proposal.
> > > > > > > > >> > >> > >> > > > > > > > > > > >
> > > > > > > > >> > >> > >> > > > > > > > > > > > The benefit of using the
> > > request
> > > > > > > > >> processing
> > > > > > > > >> > >> time
> > > > > > > > >> > >> > >> over
> > > > > > > > >> > >> > >> > the
> > > > > > > > >> > >> > >> > > > > > request
> > > > > > > > >> > >> > >> > > > > > > > > rate
> > > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > > >> > >> > >> > > > > > > > > > > > exactly what people have
> > > said. I
> > > > > > will
> > > > > > > > just
> > > > > > > > >> > >> expand
> > > > > > > > >> > >> > >> that
> > > > > > > > >> > >> > >> > a
> > > > > > > > >> > >> > >> > > > bit.
> > > > > > > > >> > >> > >> > > > > > > > > Consider
> > > > > > > > >> > >> > >> > > > > > > > > > > the
> > > > > > > > >> > >> > >> > > > > > > > > > > > following case. The
> producer
> > > > > sends a
> > > > > > > > >> produce
> > > > > > > > >> > >> > request
> > > > > > > > >> > >> > >> > > with a
> > > > > > > > >> > >> > >> > > > > > 10MB
> > > > > > > > >> > >> > >> > > > > > > > > > message
> > > > > > > > >> > >> > >> > > > > > > > > > > > but compressed to 100KB
> with
> > > > gzip.
> > > > > > The
> > > > > > > > >> > >> > >> decompression of
> > > > > > > > >> > >> > >> > > the
> > > > > > > > >> > >> > >> > > > > > > message
> > > > > > > > >> > >> > >> > > > > > > > > on
> > > > > > > > >> > >> > >> > > > > > > > > > > the
> > > > > > > > >> > >> > >> > > > > > > > > > > > broker could take 10-15
> > > seconds,
> > > > > > > during
> > > > > > > > >> which
> > > > > > > > >> > >> > time,
> > > > > > > > >> > >> > >> a
> > > > > > > > >> > >> > >> > > > request
> > > > > > > > >> > >> > >> > > > > > > > handler
> > > > > > > > >> > >> > >> > > > > > > > > > > > thread is completely
> > blocked.
> > > In
> > > > > > this
> > > > > > > > >> case,
> > > > > > > > >> > >> > neither
> > > > > > > > >> > >> > >> the
> > > > > > > > >> > >> > >> > > > > byte-in
> > > > > > > > >> > >> > >> > > > > > > > quota
> > > > > > > > >> > >> > >> > > > > > > > > > nor
> > > > > > > > >> > >> > >> > > > > > > > > > > > the request rate quota may
> > be
> > > > > > > effective
> > > > > > > > in
> > > > > > > > >> > >> > >> protecting
> > > > > > > > >> > >> > >> > the
> > > > > > > > >> > >> > >> > > > > > broker.
> > > > > > > > >> > >> > >> > > > > > > > > > > Consider
> > > > > > > > >> > >> > >> > > > > > > > > > > > another case. A consumer
> > group
> > > > > > starts
> > > > > > > > >> with 10
> > > > > > > > >> > >> > >> instances
> > > > > > > > >> > >> > >> > > and
> > > > > > > > >> > >> > >> > > > > > later
> > > > > > > > >> > >> > >> > > > > > > > on
> > > > > > > > >> > >> > >> > > > > > > > > > > > switches to 20 instances.
> > The
> > > > > > request
> > > > > > > > rate
> > > > > > > > >> > will
> > > > > > > > >> > >> > >> likely
> > > > > > > > >> > >> > >> > > > > double,
> > > > > > > > >> > >> > >> > > > > > > but
> > > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > > >> > >> > >> > > > > > > > > > > > actually load on the
> broker
> > > may
> > > > > not
> > > > > > > > double
> > > > > > > > >> > >> since
> > > > > > > > >> > >> > >> each
> > > > > > > > >> > >> > >> > > fetch
> > > > > > > > >> > >> > >> > > > > > > request
> > > > > > > > >> > >> > >> > > > > > > > > > only
> > > > > > > > >> > >> > >> > > > > > > > > > > > contains half of the
> > > partitions.
> > > > > > > Request
> > > > > > > > >> rate
> > > > > > > > >> > >> > quota
> > > > > > > > >> > >> > >> may
> > > > > > > > >> > >> > >> > > not
> > > > > > > > >> > >> > >> > > > > be
> > > > > > > > >> > >> > >> > > > > > > easy
> > > > > > > > >> > >> > >> > > > > > > > > to
> > > > > > > > >> > >> > >> > > > > > > > > > > > configure in this case.
> > > > > > > > >> > >> > >> > > > > > > > > > > >
> What we really want is to be able to prevent a client from using too much
> of the server side resources. In this particular KIP, this resource is the
> capacity of the request handler threads. I agree that it may not be
> intuitive for the users to determine how to set the right limit. However,
> this is not completely new and has been done in the container world
> already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> has the concept of cpu.cfs_quota_us, which specifies the total amount of
> time in microseconds for which all tasks in a cgroup can run during a one
> second period. We can potentially model the request handler threads in a
> similar way. For example, each request handler thread can be 1 request
> handler unit and the admin can configure a limit on how many units (say
> 0.01) a client can have.
>
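(To make the cgroup analogy concrete, a minimal sketch of per-user
"request handler units" with invented class and method names; this is
illustrative only, not the KIP's implementation.)

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch: per-user "request handler units", analogous to
    // cpu.cfs_quota_us with a one-second cpu.cfs_period_us. A quota of
    // 0.01 units means 1% of one handler thread's time per period.
    class HandlerUnitQuota {
        private static final long PERIOD_NANOS = 1_000_000_000L; // 1s period
        private final Map<String, Double> unitsByUser = new ConcurrentHashMap<>();
        private final Map<String, Long> usedNanosByUser = new ConcurrentHashMap<>();

        void setQuota(String user, double units) {
            unitsByUser.put(user, units);
        }

        // Called by a handler thread after it finishes processing a request.
        void record(String user, long handlerNanos) {
            usedNanosByUser.merge(user, handlerNanos, Long::sum);
        }

        // True if the user has consumed more than its share this period.
        boolean exceeded(String user) {
            double allowedNanos = unitsByUser.getOrDefault(user, 1.0) * PERIOD_NANOS;
            return usedNanosByUser.getOrDefault(user, 0L) > allowedNanos;
        }

        // Reset at the start of each period (e.g. by a scheduled task).
        void resetPeriod() {
            usedNanosByUser.clear();
        }
    }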
> Regarding not throttling the internal broker to broker requests. We could
> do that. Alternatively, we could just let the admin configure a high limit
> for the kafka user (it may not be able to do that easily based on clientId
> though).
>
> Ideally we want to be able to protect the utilization of the network
> thread pool too. The difficulty is mostly what Rajini said: (1) The
> mechanism for throttling the requests is through Purgatory and we will
> have to think through how to integrate that into the network layer. (2) In
> the network layer, currently we know the user, but not the clientId of the
> request. So, it's a bit tricky to throttle based on clientId there. Plus,
> the byteOut quota can already protect the network thread utilization for
> fetch requests. So, if we can't figure out this part right now, just
> focusing on the request handling threads for this KIP is still a useful
> feature.
>
> Thanks,
>
> Jun
>
> On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> wrote:
>
> > Thank you all for the feedback.
> >
> > Jay: I have removed exemption for consumer heartbeat etc. Agree that
> > protecting the cluster is more important than protecting individual
> > apps. Have retained the exemption for StopReplica/LeaderAndIsr etc,
> > these are throttled only if authorization fails (so can't be used for
> > DoS attacks in a secure cluster, but allows inter-broker requests to
> > complete without delays).
> >
> > I will wait another day to see if there is any objection to quotas
> > based on request processing time (as opposed to request rate) and if
> > there are no objections, I will revert to the original proposal with
> > some changes.
> >
> > The original proposal was only including the time used by the request
> > handler threads (that made calculation easy). I think the suggestion is
> > to include the time spent in the network threads as well since that may
> > be significant. As Jay pointed out, it is more complicated to calculate
> > the total available CPU time and convert to a ratio when there are *m*
> > I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime()
> > may give us what we want, but it can be very expensive on some
> > platforms. As Becket and Guozhang have pointed out, we do have several
> > time measurements already for generating metrics that we could use,
> > though we might want to switch to nanoTime() instead of
> > currentTimeMillis() since some of the values for small requests may be
> > < 1ms. But rather than add up the time spent in I/O thread and network
> > thread, wouldn't it be better to convert the time spent on each thread
> > into a separate ratio? UserA has a request quota of 5%. Can we take
> > that to mean that UserA can use 5% of the time on network threads and
> > 5% of the time on I/O threads? If either is exceeded, the response is
> > throttled - it would mean maintaining two sets of metrics for the two
> > durations, but would result in more meaningful ratios. We could define
> > two quota limits (UserA has 5% of request threads and 10% of network
> > threads), but that seems unnecessary and harder to explain to users.
> >
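(A minimal sketch of the separate-ratio check described above; the class
and method names are invented for illustration, and the per-pool busy
times are assumed to be recorded elsewhere.)

    // Hypothetical sketch: one utilization ratio per thread pool; the
    // response is throttled if either pool's ratio exceeds the quota.
    class PerPoolQuotaCheck {
        private final double quotaRatio; // e.g. 0.05 for a 5% request quota

        PerPoolQuotaCheck(double quotaRatio) {
            this.quotaRatio = quotaRatio;
        }

        // ioBusyNanos/networkBusyNanos: time this user spent on each pool
        // in the window; windowNanos: elapsed wall-clock time of the window.
        boolean shouldThrottle(long ioBusyNanos, long networkBusyNanos, long windowNanos) {
            double ioRatio = (double) ioBusyNanos / windowNanos;
            double networkRatio = (double) networkBusyNanos / windowNanos;
            return ioRatio > quotaRatio || networkRatio > quotaRatio;
        }
    }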
> > Back to why and how quotas are applied to network thread utilization:
> > a) In the case of fetch, the time spent in the network thread may be
> > significant and I can see the need to include this. Are there other
> > requests where the network thread utilization is significant? In the
> > case of fetch, request handler thread utilization would throttle
> > clients with high request rate, low data volume and fetch byte rate
> > quota will throttle clients with high data volume. Network thread
> > utilization is perhaps proportional to the data volume. I am wondering
> > if we even need to throttle based on network thread utilization or
> > whether the data volume quota covers this case.
> >
> > b) At the moment, we record and check for quota violation at the same
> > time. If a quota is violated, the response is delayed. Using Jay's
> > example of disk reads for fetches happening in the network thread, we
> > can't record and delay a response after the disk reads. We could record
> > the time spent on the network thread when the response is complete and
> > introduce a delay for handling a subsequent request (separate out
> > recording and quota violation handling in the case of network thread
> > overload). Does that make sense?
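(One possible shape of the record-then-delay-later idea in (b); this is
entirely hypothetical, since the KIP does not specify an implementation.)

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Hypothetical sketch: when work happens on the network thread (e.g.
    // disk reads for a fetch), record the observed overage once the
    // response completes and apply the penalty to the client's next request.
    class DeferredThrottle {
        private final Map<String, Long> pendingDelayMs = new ConcurrentHashMap<>();

        // Called when a response finishes on the network thread.
        void onResponseComplete(String clientId, long observedDelayMs) {
            if (observedDelayMs > 0)
                pendingDelayMs.merge(clientId, observedDelayMs, Long::sum);
        }

        // Called before handling the client's next request; returns the
        // delay to apply now, and clears it.
        long takeDelayMs(String clientId) {
            Long delay = pendingDelayMs.remove(clientId);
            return delay == null ? 0L : delay;
        }
    }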
> >
> > Regards,
> >
> > Rajini
> >
> > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:
> >
> > > Hey Jay,
> > >
> > > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > > thinking that maybe we can use the existing request statistics. They
> > > are already very detailed so we can probably see the approximate CPU
> > > time from it, e.g. something like (total_time -
> > > request/response_queue_time - remote_time).
> > >
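(Becket's arithmetic written out as a hypothetical helper; the parameter
names stand in for the existing per-request time metrics.)

    // Hypothetical helper: approximate CPU time of a request from the
    // existing statistics, i.e. everything that is neither queueing nor
    // waiting on remote replicas.
    static long approxCpuNanos(long totalNanos,
                               long requestQueueNanos,
                               long responseQueueNanos,
                               long remoteNanos) {
        return totalNanos - requestQueueNanos - responseQueueNanos - remoteNanos;
    }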
> > > I agree with Guozhang that when a user is throttled it is likely that
> > > we need to see if anything has gone wrong first, and if the users are
> > > well behaving and just need more resources, we will have to bump up
> > > the quota for them. It is true that pre-allocating CPU time quota
> > > precisely for the users is difficult. So in practice it would probably
> > > be more like first set a relatively high protective CPU time quota for
> > > everyone and increase that for some individual clients on demand.
> > >
> > > Thanks,
> > >
> > > Jiangjie (Becket) Qin
> > >
> > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com>
> > > wrote:
> > >
> > > > This is a great proposal, glad to see it happening.
> > > >
> > > > I am inclined to the CPU throttling, or more specifically processing
> > > > time ratio instead of the request rate throttling as well. Becket
> > > > has very well summed my rationales above, and one thing to add here
> > > > is that the former has a good support for both "protecting against
> > > > rogue clients" as well as "utilizing a cluster for multi-tenancy
> > > > usage": when thinking about how to explain this to the end users, I
> > > > find it actually more natural than the request rate since as
> > > > mentioned above, different requests will have quite different
> > > > "cost", and Kafka today already has various request types (produce,
> > > > fetch, admin, metadata, etc); because of that the request rate
> > > > throttling may not be as effective unless it is set very
> > > > conservatively.
> > > >
> > > > Regarding user reactions when they are throttled, I think it may
> > > > differ case-by-case, and need to be discovered / guided by looking
> > > > at relative metrics. So in other words users would not expect to get
> > > > additional information by simply being told "hey, you are
> > > > throttled", which is all what throttling does; they need to take a
> > > > follow-up step and see "hmm, I'm throttled probably because of ..",
> > > > which is by looking at other metric values: e.g. whether I'm
> > > > bombarding the brokers with ...
> > >
> > > [Message clipped]

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Jiangjie,

Yes, I agree that byte rate already protects the network threads
indirectly. I am not sure if byte rate fully captures the CPU overhead in
network due to SSL. So, at the high level, we can use request time limit to
protect CPU and use byte rate to protect storage and network.

Also, do you think you can get Todd to comment on this KIP?

Thanks,

Jun

On Tue, Mar 7, 2017 at 11:21 AM, Becket Qin <be...@gmail.com> wrote:

> Hi Rajini/Jun,
>
> The percentage based reasoning sounds good.
> One thing I am wondering is that if we assume the network threads are just
> doing the network IO, can we say the byte rate quota is already sort of a
> network thread quota?
> If we take network threads into consideration here, would that be somewhat
> overlapping with the byte rate quota?
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Jun,
> >
> > Thank you for the explanation, I hadn't realized you meant percentage of
> > the total thread pool. If everyone is OK with Jun's suggestion, I will
> > update the KIP.
> >
> > Thanks,
> >
> > Rajini
> >
> > On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Let's take your example. Let's say a user sets the limit to 50%. I am
> > > not sure if it's better to apply the same percentage separately to
> > > network and io thread pool. For example, for produce requests, most of
> > > the time will be spent in the io threads whereas for fetch requests,
> > > most of the time will be in the network threads. So, using the same
> > > percentage in both thread pools means one of the pools' resources will
> > > be over-allocated.
> > >
> > > An alternative way is to simply model network and io thread pool
> > > together. If you get 10 io threads and 5 network threads, you get
> > > 1500% request processing power. A 50% limit means a total of 750%
> > > processing power. We just add up the time a user request spent in
> > > either network or io thread. If that total exceeds 750% (doesn't
> > > matter whether it's spent more in network or io thread), the request
> > > will be throttled. This seems more general and is not sensitive to the
> > > current implementation detail of having a separate network and io
> > > thread pool. In the future, if the threading model changes, the same
> > > concept of quota can still be applied. For now, since it's a bit
> > > tricky to add the delay logic in the network thread pool, we could
> > > probably just do the delaying only in the io threads as you suggested
> > > earlier.
> > >
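(A worked sketch of the combined accounting in Jun's 10 io + 5 network
thread example; the class and its names are invented for illustration.)

    // Hypothetical sketch: model all request-processing threads together.
    // 10 io + 5 network threads = 1500% capacity; a 50% user limit = 750%.
    class CombinedPoolQuota {
        private final int totalThreads;   // io + network, e.g. 15
        private final double limitShare;  // e.g. 0.5 for a 50% limit

        CombinedPoolQuota(int ioThreads, int networkThreads, double limitShare) {
            this.totalThreads = ioThreads + networkThreads;
            this.limitShare = limitShare;
        }

        // Sum of time the user's requests spent on either pool in the window.
        boolean shouldThrottle(long ioBusyNanos, long networkBusyNanos, long windowNanos) {
            double usedPercent = 100.0 * (ioBusyNanos + networkBusyNanos) / windowNanos;
            double allowedPercent = limitShare * totalThreads * 100.0; // 750% here
            return usedPercent > allowedPercent;
        }
    }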
> > > There is still the orthogonal question of whether a quota of 50% is
> > > out of 100% or 100% * #total processing threads. My feeling is that
> > > the latter is slightly better based on my explanation earlier. The way
> > > to describe this quota to the users can be "share of elapsed request
> > > processing time on a single CPU" (similar to top).
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > wrote:
> > >
> > > > Jun,
> > > >
> > > > Agree about the two scenarios.
> > > >
> > > > But still not sure about a single quota covering both network
> > > > threads and I/O threads with per-thread quota. If there are 10 I/O
> > > > threads and 5 network threads and I want to assign half the quota to
> > > > userA, the quota would be 750%. I imagine, internally, we would
> > > > convert this to 500% for I/O and 250% for network threads to
> > > > allocate 50% of each pool.
> > > >
> > > > A couple of scenarios:
> > > >
> > > > 1. Admin adds 1 extra network thread. To retain 50%, admin needs to
> > > > now allocate 800% for each user. Or increase the quota for a few
> > > > users. To me, it feels like admin needs to convert 50% to 800% and
> > > > Kafka internally needs to convert 800% to (500%, 300%). Everyone
> > > > using just 50% feels a lot simpler.
> > > >
> > > > 2. We decide to add some other thread to this list. Admin needs to
> > > > know exactly how many threads form the maximum quota. And we can be
> > > > changing this between broker versions as we add more to the list.
> > > > Again a single overall percent would be a lot simpler.
> > > >
> > > > There were others who were unconvinced by a single percent from the
> > > > initial proposal and were happier with thread units similar to CPU
> > > > units, so I am ok with going with per-thread quotas (as units or
> > > > percent). Just not sure it makes it easier for admin in all cases.
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Rajini,
> > > > >
> > > > > Consider modeling as n * 100% unit. For 2), the question is what's
> > > > > causing the I/O threads to be saturated. It's unlikely that all
> > > > > users' utilization has increased at the same time. A more likely
> > > > > case is that a few isolated users' utilization has increased. If
> > > > > so, after increasing the number of threads, the admin just needs
> > > > > to adjust the quota for a few isolated users, which is expected
> > > > > and is less work.
> > > > >
> > > > > Consider modeling as 1 * 100% unit. For 1), all users' quotas need
> > > > > to be adjusted, which is unexpected and is more work.
> > > > >
> > > > > So, to me, the n * 100% model seems more convenient.
> > > > >
> > > > > As for future extension to cover network thread utilization, I was
> > > > > thinking that one way is to simply model the capacity as (n + m) *
> > > > > 100% unit, where n and m are the number of network and i/o
> > > > > threads, respectively. Then, for each user, we can just add up the
> > > > > utilization in the network and the i/o thread. If we do this, we
> > > > > don't need a new type of quota.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > wrote:
> > > > >
> > > > > > Jun,
> > > > > >
> > > > > > If we use request.percentage as the percentage used in a single
> > > > > > I/O thread, the total percentage being allocated will be
> > > > > > num.io.threads * 100 for I/O threads and num.network.threads *
> > > > > > 100 for network threads. A single quota covering the two as a
> > > > > > percentage wouldn't quite work if you want to allocate the same
> > > > > > proportion in both cases. If we want to treat threads as
> > > > > > separate units, won't we need two quota configurations
> > > > > > regardless of whether we use units or percentage? Perhaps I
> > > > > > misunderstood your suggestion.
> > > > > >
> > > > > > I think there are two cases:
> > > > > >
> > > > > >    1. The use case that you mentioned where an admin is adding
> > > > > >    more users and decides to add more I/O threads and expects to
> > > > > >    find free quota to allocate for new users.
> > > > > >    2. Admin adds more I/O threads because the I/O threads are
> > > > > >    saturated and there are cores available to allocate, even
> > > > > >    though the number of users/clients hasn't changed.
> > > > > >
> > > > > > If we treated I/O threads as a single unit of 100%, all user
> > > > > > quotas need to be reallocated for 1). If we allocated I/O
> > > > > > threads as n units with n*100%, all user quotas need to be
> > > > > > reallocated for 2), otherwise some of the new threads may just
> > > > > > not be used. Either way it should be easy to write a script to
> > > > > > decrease/increase quotas by a multiple for all users.
> > > > > >
> > > > > > So it really boils down to which quota unit is most intuitive in
> > > > > > terms of configuration. And from the discussion so far, it feels
> > > > > > like opinion is divided on whether quotas should be carved out
> > > > > > of an absolute 100% (or 1 unit) or be relative to the number of
> > > > > > threads (n*100% or n units).
> > > > > >
> > > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > > >
> > > > > > > Another way to express an absolute limit is to use
> > > > > > > request.percentage, but treat it as the percentage used in a
> > > > > > > single request handling thread. For now, the request handling
> > > > > > > threads can be just the io threads. In the future, they can
> > > > > > > cover the network threads as well. This is similar to how top
> > > > > > > reports CPU usage and may be a bit easier for people to
> > > > > > > understand.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io> wrote:
> > > > > > >
> > > > > > > > Hi, Jay,
> > > > > > > >
> > > > > > > > 2. Regarding request.unit vs request.percentage. I started
> > > > > > > > with request.percentage too. The reasoning for request.unit
> > > > > > > > is the following. Suppose that the capacity has been reached
> > > > > > > > on a broker and the admin needs to add a new user. A simple
> > > > > > > > way to increase the capacity is to increase the number of io
> > > > > > > > threads, assuming there are still enough cores. If the limit
> > > > > > > > is based on percentage, the additional capacity
> > > > > > > > automatically gets distributed to existing users and we
> > > > > > > > haven't really carved out any additional resource for the
> > > > > > > > new user. Now, is it easy for a user to reason about 0.1
> > > > > > > > unit vs 10%? My feeling is that both are hard and have to be
> > > > > > > > configured empirically. Not sure if percentage is obviously
> > > > > > > > easier to reason about.
> > > > > > > >
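(To make the difference concrete with illustrative numbers: with 8 io
threads, a quota of 0.1 unit is 10% of one thread, i.e. 1.25% of the pool.
Doubling the io threads to 16 leaves the 0.1-unit user with the same
absolute capacity but only 0.625% of the pool; under a percentage quota
the user's absolute capacity would have doubled instead.)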
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <jay@confluent.io> wrote:
> > > > > > > >
> > > > > > > >> A couple of quick points:
> > > > > > > >>
> > > > > > > >> 1. Even though the implementation of this quota is only
> > > > > > > >> using io thread time, I think we should call it something
> > > > > > > >> like "request-time". This will give us flexibility to
> > > > > > > >> improve the implementation to cover network threads in the
> > > > > > > >> future and will avoid exposing internal details like our
> > > > > > > >> thread pools on the server.
> > > > > > > >>
> > > > > > > >> 2. Jun/Roger, I get what you are trying to fix but the idea
> > > > > > > >> of thread/units is super unintuitive as a user-facing knob.
> > > > > > > >> I had to read the KIP like eight times to understand this.
> > > > > > > >> I'm not sure about your point that increasing the number of
> > > > > > > >> threads is a problem with a percentage-based value; it
> > > > > > > >> really depends on whether the user thinks about the
> > > > > > > >> "percentage of request processing time" or "thread units".
> > > > > > > >> If they think "I have allocated 10% of my request
> > > > > > > >> processing time to user x" then it is a bug that increasing
> > > > > > > >> the thread count decreases that percent as it does in the
> > > > > > > >> current proposal. As a practical matter I think the only
> > > > > > > >> way to actually reason about this is as a percent---I just
> > > > > > > >> don't believe people are going to think, "ah, 4.3 thread
> > > > > > > >> units, that is the right amount!". Instead I think they
> > > > > > > >> have to understand this thread unit concept, figure out
> > > > > > > >> what they have set in number of threads, compute a percent
> > > > > > > >> and then come up with the number of thread units, and these
> > > > > > > >> will all be wrong if that thread count changes. I also
> > > > > > > >> think this ties us to throttling the I/O thread pool, which
> > > > > > > >> may not be where we want to end up.
> > > > > > > >>
> > > > > > > >> 3. For what it's worth I do think having a single
> > > > > > > >> throttle_ms field in all the responses that combines all
> > > > > > > >> throttling from all quotas is probably the simplest. There
> > > > > > > >> could be a use case for having separate fields for each,
> > > > > > > >> but I think that is actually harder to use/monitor in the
> > > > > > > >> common case so unless someone has a use case I think just
> > > > > > > >> one should be fine.
> > > > > > > >>
> > > > > > > >> -Jay
> > > > > > > >>
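(A sketch of the single combined field Jay prefers; the names are
invented, and the real response schema is defined by the KIP itself.)

    // Hypothetical sketch: a response carries one combined throttle time,
    // here taken as the maximum of the delays imposed by each violated quota.
    static long combinedThrottleMs(long byteRateThrottleMs, long requestTimeThrottleMs) {
        return Math.max(byteRateThrottleMs, requestTimeThrottleMs);
    }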
> > > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > >> wrote:
> > > > > > > >>
> > > > > > > >> > I have updated the KIP based on the discussions so far.
> > > > > > > >> >
> > > > > > > >> > Regards,
> > > > > > > >> >
> > > > > > > >> > Rajini
> > > > > > > >> >
> > > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > > > > > > >> > wrote:
> > > > > > > >> >
> > > > > > > >> > > Thank you all for the feedback.
> > > > > > > >> > >
> > > > > > > >> > > Ismael #1. It makes sense not to throttle inter-broker
> > > > > > > >> > > requests like LeaderAndIsr etc. The simplest way to
> > > > > > > >> > > ensure that clients cannot use these requests to bypass
> > > > > > > >> > > quotas for DoS attacks is to ensure that ACLs prevent
> > > > > > > >> > > clients from using these requests and unauthorized
> > > > > > > >> > > requests are included towards quotas.
> > > > > > > >> > >
> > > > > > > >> > > Ismael #2, Jay #1: I was thinking that these quotas can
> > > > > > > >> > > return a separate throttle time, and all utilization
> > > > > > > >> > > based quotas could use the same field (we won't add
> > > > > > > >> > > another one for network thread utilization for
> > > > > > > >> > > instance). But perhaps it makes sense to keep byte rate
> > > > > > > >> > > quotas separate in produce/fetch responses to provide
> > > > > > > >> > > separate metrics? Agree with Ismael that the name of
> > > > > > > >> > > the existing field should be changed if we have two.
> > > > > > > >> > > Happy to switch to a single combined throttle time if
> > > > > > > >> > > that is sufficient.
> > > > > > > >> > >
> > > > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will use dot
> > > > > > > >> > > separated name for new property. Replication quotas use
> > > > > > > >> > > dot separated, so it will be consistent with all
> > > > > > > >> > > properties except byte rate quotas.
> > > > > > > >> > >
> > > > > > > >> > > Radai: #1 Request processing time rather than request
> > > > > > > >> > > rate was chosen because the time per request can vary
> > > > > > > >> > > significantly between requests as mentioned in the
> > > > > > > >> > > discussion and KIP.
> > > > > > > >> > > #2 Two separate quotas for heartbeats/regular requests
> > > > > > > >> > > feel like more configuration and more metrics. Since
> > > > > > > >> > > most users would set quotas higher than the expected
> > > > > > > >> > > usage and quotas are more of a safety net, a single
> > > > > > > >> > > quota should work in most cases.
> > > > > > > >> > > #3 The number of requests in purgatory is limited by
> > > > > > > >> > > the number of active connections since only one request
> > > > > > > >> > > per connection will be throttled at a time.
> > > > > > > >> > > #4 As with byte rate quotas, to use the full allocated
> > > > > > > >> > > quotas, clients/users would need to use partitions that
> > > > > > > >> > > are distributed across the cluster. The alternative of
> > > > > > > >> > > using cluster-wide quotas instead of per-broker quotas
> > > > > > > >> > > would be far too complex to implement.
> > > > > > > >> > >
> > > > > > > >> > > Dong: We currently have two ClientQuotaManagers for
> > > > > > > >> > > quota types Fetch and Produce. A new one will be added
> > > > > > > >> > > for IOThread, which manages quotas for I/O thread
> > > > > > > >> > > utilization. This will not update the Fetch or Produce
> > > > > > > >> > > queue-size, but will have a separate metric for the
> > > > > > > >> > > queue-size. I wasn't planning to add any additional
> > > > > > > >> > > metrics apart from the equivalent ones for existing
> > > > > > > >> > > quotas as part of this KIP. Ratio of byte-rate to I/O
> > > > > > > >> > > thread utilization could be slightly misleading since
> > > > > > > >> > > it depends on the sequence of requests. But we can look
> > > > > > > >> > > into more metrics after the KIP is implemented if
> > > > > > > >> > > required.
> > > > > > > >> > >
> > > > > > > >> > > I think we need to limit the maximum delay since all
> > > > > > > >> > > requests are throttled. If a client has a quota of
> > > > > > > >> > > 0.001 units and a single request used 50ms, we don't
> > > > > > > >> > > want to delay all requests from the client by 50
> > > > > > > >> > > seconds, throwing the client out of all its consumer
> > > > > > > >> > > groups. The issue is only if a user is allocated a
> > > > > > > >> > > quota that is insufficient to process one large
> > > > > > > >> > > request. The expectation is that the units allocated
> > > > > > > >> > > per user will be much higher than the time taken to
> > > > > > > >> > > process one request and the limit should seldom be
> > > > > > > >> > > applied. Agree this needs proper documentation.
> > > > > > > >> > >
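(The arithmetic behind this cap, as a hypothetical sketch; the variable
names are invented.)

    // A client with quota 0.001 units that spends 50 ms of handler time
    // would naively owe 50 ms / 0.001 = 50,000 ms of delay; capping the
    // delay at the quota window keeps one large request from stalling the
    // client long enough to fall out of its consumer groups.
    static long throttleDelayMs(long busyMs, double quotaUnits, long windowMs) {
        long delay = (long) (busyMs / quotaUnits);
        return Math.min(delay, windowMs); // cap at the metrics window size
    }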
> > > > > > > >> > > Regards,
> > > > > > > >> > >
> > > > > > > >> > > Rajini
> > > > > > > >> > >
> > > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com>
> > > > > > > >> > > wrote:
> > > > > > > >> > >
> > > > > > > >> > >> @jun: I wasn't concerned about tying up a request
> > > > > > > >> > >> processing thread, but IIUC the code does still read
> > > > > > > >> > >> the entire request out, which might add up to a
> > > > > > > >> > >> non-negligible amount of memory.
> > > > > > > >> > >>
> > > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <lindong28@gmail.com>
> > > > > > > >> > >> wrote:
> > > > > > > >> > >>
Hey Rajini,

The current KIP says that the maximum delay will be reduced to the window
size if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota
over a long period of time. Can you provide an upper bound on this
deviation?

2) What is the motivation for capping the maximum delay at the window
size? I am wondering if there is a better alternative to address the
problem.

3) It means that the existing metric-related config will have a more
direct impact on the mechanism of this io-thread-unit-based quota. This
may be an important change depending on the answer to 1) above. We
probably need to document this more explicitly.

Dong

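(A worked illustration of the deviation in question, using the numbers
from the 0.001-unit example above rather than figures from the KIP: a
request consuming 50ms of handler time in a 1-second window would warrant
a ~49 second uncapped delay; capped at one window, the client waits only 1
second, so in that window it used roughly 50x its allocation. The overage
is therefore bounded by the handler time of the largest single request
divided by the per-window quota, not by the quota itself.)
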
On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Jun,

Yeah you are right. I thought it wasn't because at LinkedIn it will be too
much pressure on inGraph to expose those per-clientId metrics, so we ended
up printing them periodically to a local log. Never mind if it is not a
general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field to
ProduceResponse or FetchResponse for every quota. Is there any use-case
for having separate throttle-time fields for byte-rate-quota and
io-thread-unit-quota? You probably need to document this as an interface
change if you plan to add a new field in any request.

- I don't think IOThread belongs to quotaType. The existing quota types
(i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the
type of request that is throttled, not the quota mechanism that is
applied.

- If a request is throttled due to this io-thread-unit-based quota, is the
existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the
io-thread-unit-based quota, and for users to understand its impact on
their traffic, would it be useful to have a metric that shows the overall
byte-rate per io-thread-unit? Can we also show this as a per-clientId
metric?

Thanks,
Dong

On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU
cores, but it's possible for an admin to start with fewer io threads than
cores and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a
user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of
returning an error immediately is that the latter has no way to prevent
the client from retrying immediately, which will make things worse. The
delaying logic is based off a delay queue. A separate expiration thread
just waits on the next request to expire. So, it doesn't tie up a request
handler thread.

Thanks,

Jun

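A self-contained Java sketch of the mechanism Jun describes, using
java.util.concurrent.DelayQueue (the types here are hypothetical, not the
broker's actual classes): throttled responses are parked in the queue and
a single expiration thread sends each one when its delay elapses, so no
request handler thread blocks while waiting.

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    public final class ThrottledResponseQueue {

        static final class ThrottledResponse implements Delayed {
            final long sendAtNanos;
            final Runnable sendResponse;

            ThrottledResponse(long delayMs, Runnable sendResponse) {
                this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
                this.sendResponse = sendResponse;
            }

            @Override public long getDelay(TimeUnit unit) {
                return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
            }

            @Override public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                    other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void throttle(long delayMs, Runnable sendResponse) {
            queue.put(new ThrottledResponse(delayMs, sendResponse));
        }

        // The dedicated expiration thread: take() blocks until the head's
        // delay has elapsed, then the response is sent.
        void runExpirationLoop() throws InterruptedException {
            while (!Thread.currentThread().isInterrupted())
                queue.take().sendResponse.run();
        }

        public static void main(String[] args) throws InterruptedException {
            ThrottledResponseQueue q = new ThrottledResponseQueue();
            new Thread(() -> {
                try { q.runExpirationLoop(); } catch (InterruptedException ignored) { }
            }).start();
            q.throttle(100, () -> System.out.println("response sent after throttle delay"));
            Thread.sleep(200); // let the expiration thread fire, then exit
            System.exit(0);
        }
    }
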
On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:

Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle
time field in the response. The downside is that the client metrics will
be more coarse-grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and
`log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:

A few minor comments:

   1. Isn't it the case that the throttling time response field should
   have the total time your request was throttled, irrespective of the
   quotas that caused it? Limiting it to the byte rate quota doesn't make
   sense, but I also don't think we want to end up adding new fields in
   the response for every single thing we quota, right?
   2. I don't think we should make this quota specifically about io
   threads. Once we introduce these quotas people set them and expect them
   to be enforced (and if they aren't it may cause an outage). As a result
   they are a bit more sensitive than normal configs, I think. The current
   thread pools seem like something of an implementation detail and not
   the level the user-facing quotas should be involved with. I think it
   might be better to make this a general request-time throttle with no
   mention in the naming about I/O threads, and simply acknowledge the
   current limitation (which we may someday fix) in the docs that this
   covers only the time after the request is read off the network.
   3. As such I think the right interface to the user would be something
   like percent_request_time in {0,...,100} or request_time_ratio in
   {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale
   is between 0 and 1 in the other metrics, right?)

-Jay

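A minimal sketch of the user-facing shape Jay proposes in point 3, with a
single request-time quota accepted either as a percent or a ratio (config
names and the simple single-thread-normalized check are assumptions for
illustration, not the final KIP definitions):

    public final class RequestTimeQuota {
        private final double ratio; // fraction of request handler time, 0.0..1.0

        private RequestTimeQuota(double ratio) {
            if (ratio < 0.0 || ratio > 1.0)
                throw new IllegalArgumentException(
                        "request_time_ratio must be in [0.0, 1.0]: " + ratio);
            this.ratio = ratio;
        }

        static RequestTimeQuota fromRatio(double ratio)     { return new RequestTimeQuota(ratio); }
        static RequestTimeQuota fromPercent(double percent) { return new RequestTimeQuota(percent / 100.0); }

        /** True if the client used more than its share of handler time in the
         *  window (usage normalized to a single thread's time for simplicity). */
        boolean violated(long usedMsInWindow, long windowMs) {
            return usedMsInWindow > ratio * windowMs;
        }

        public static void main(String[] args) {
            RequestTimeQuota q = RequestTimeQuota.fromPercent(5.0); // 5% of request time
            System.out.println(q.violated(80, 1000));  // true: 80ms > 50ms allowed
            System.out.println(q.violated(30, 1000));  // false
        }
    }
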
On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Guozhang/Dong,

Thank you for the feedback.

Guozhang: I have updated the section on co-existence of byte rate and
request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are
going to be very similar to the existing metrics and sensors. To avoid
confusion, I have now added more detail. All metrics are in the group
"quotaType" and all sensors have names starting with "quotaType" (where
quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/
*IOThread*). So there will be no reuse of existing metrics/sensors. The
new ones for request processing time based throttling will be completely
independent of existing metrics/sensors, but will be consistent in format.

The existing throttle_time_ms field in produce/fetch responses will not be
impacted by this KIP. That will continue to return byte-rate based
throttling times. In addition, a new field request_throttle_time_ms will
be added to return request quota based throttling times. These will be
exposed as new metrics on the client-side.

Since all metrics and sensors are different for each type of quota, I
believe there are already sufficient metrics to monitor throttling on both
the client and broker side for each type of throttling.

Regards,

Rajini

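To make the naming scheme concrete, a sketch using the Kafka common
metrics API (the specific metric and sensor names here are illustrative
assumptions, not the final names from the KIP):

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;

    public final class QuotaTypeSensors {
        enum QuotaType { Produce, Fetch, LeaderReplication, FollowerReplication, IOThread }

        // One independent sensor per quota type; group and sensor names are
        // keyed off the quota type, mirroring the format described above.
        static Sensor throttleTimeSensor(Metrics metrics, QuotaType type, String clientId) {
            Sensor sensor = metrics.sensor(type + "-throttle-time-" + clientId);
            sensor.add(metrics.metricName("throttle-time-avg", type.name(),
                    "Average throttle time in ms for " + clientId), new Avg());
            return sensor;
        }

        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            // IOThread sensors are new and completely independent of the
            // existing Produce/Fetch byte-rate sensors; only the naming
            // format is shared.
            Sensor ioThrottle = throttleTimeSensor(metrics, QuotaType.IOThread, "clientA");
            ioThrottle.record(42.0);
            metrics.close();
        }
    }
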
On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:

Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to
quota user's traffic here. LGTM overall. I have some questions regarding
sensors.

- Can you be more specific in the KIP about what sensors will be added?
For example, it will be useful to specify the name and attributes of these
new sensors.

- We currently have throttle-time and queue-size for the byte-rate based
quota. Are you going to have separate throttle-time and queue-size for
requests throttled by the io_thread_unit-based quota, or will they share
the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain
time due to the io_thread_unit-based quota?

- Currently the kafka server doesn't provide any log or metrics that tell
whether any given clientId (or user) is throttled. This is not too bad
because we can still check the client-side byte-rate metric to validate
whether a given client is throttled. But with this io_thread_unit, there
will be no way to validate whether a given client is slow because it has
exceeded its io_thread_unit limit. It is necessary for the user to be able
to know this information to figure out whether they have reached their
quota limit. How about we add a log4j log on the server side to
periodically print the (client_id, byte-rate-throttle-time,
io-thread-unit-throttle-time) so that the kafka administrator can figure
out which users have reached their limit and act accordingly?

Thanks,
Dong

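A sketch of the periodic server-side log Dong suggests (all names here are
hypothetical; the accumulation is deliberately simplified and not strictly
atomic, which is fine for an illustration):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public final class ThrottleTimeLogger {
        private static final Logger log = LoggerFactory.getLogger(ThrottleTimeLogger.class);

        // clientId -> [byteRateThrottleMs, ioThreadUnitThrottleMs] for the interval
        private final Map<String, long[]> throttleMs = new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        void recordByteRateThrottle(String clientId, long ms) {
            throttleMs.computeIfAbsent(clientId, id -> new long[2])[0] += ms;
        }

        void recordIoThreadUnitThrottle(String clientId, long ms) {
            throttleMs.computeIfAbsent(clientId, id -> new long[2])[1] += ms;
        }

        // Periodically print which clients were throttled and by how much,
        // then reset for the next interval.
        void start(long intervalSeconds) {
            scheduler.scheduleAtFixedRate(() -> {
                throttleMs.forEach((clientId, t) -> {
                    if (t[0] > 0 || t[1] > 0)
                        log.info("clientId={} byte-rate-throttle-time={}ms io-thread-unit-throttle-time={}ms",
                                clientId, t[0], t[1]);
                });
                throttleMs.clear();
            }, intervalSeconds, intervalSeconds, TimeUnit.SECONDS);
        }
    }
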
On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

Made a pass over the doc, overall LGTM except a minor comment on the
throttling implementation:

Stated as "Request processing time throttling will be applied on top if
necessary." I thought that it meant the request processing time throttling
is applied first, but continuing to read I found it actually meant to
apply produce / fetch byte rate throttling first.

Also the last sentence, "The remaining delay if any is applied to the
response.", is a bit confusing to me. Maybe reword it a bit?

Guozhang

On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of percentage. The
property is called *io_thread_units* to align with the thread count
property *num.io.threads*. When we implement network thread utilization
quotas, we can add another property *network_thread_units*.

2. ControlledShutdown is already listed under the exempt requests. Jun,
did you mean a different request that needs to be added? The four requests
currently exempt in the KIP are StopReplica, ControlledShutdown,
LeaderAndIsr and UpdateMetadata. These are controlled using the
ClusterAction ACL, so it is easy to exclude them and only throttle if
unauthorized. I wasn't sure if there are other requests used only for
inter-broker communication that needed to be excluded.

3. I was thinking the smallest change would be to replace all references
to *requestChannel.sendResponse()* with a local method
*sendResponseMaybeThrottle()* that applies the throttling, if any, and
then sends the response. If we throttle first in *KafkaApis.handle()*, the
time spent within the method handling the request will not be recorded or
used in throttling. We can look into this again when the PR is ready for
review.

Regards,

Rajini

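The shape of the refactor in point 3, sketched in Java with hypothetical
interfaces (the real broker code is Scala and the signatures differ):

    public final class ResponseSender {

        interface Throttler {
            /** Records usage and returns the delay for this client, 0 if within quota. */
            long recordAndGetThrottleMs(String clientId, long requestTimeNanos);
            /** Parks the response and sends it after delayMs, e.g. via a delay queue. */
            void sendDelayed(Object response, long delayMs);
        }

        interface RequestChannel {
            void sendResponse(Object response);
        }

        private final Throttler throttler;
        private final RequestChannel channel;

        ResponseSender(Throttler throttler, RequestChannel channel) {
            this.throttler = throttler;
            this.channel = channel;
        }

        // Throttling at response time (rather than at the top of
        // KafkaApis.handle()) means the time spent inside the request
        // handler is itself recorded and counted against the quota.
        void sendResponseMaybeThrottle(String clientId, long requestTimeNanos, Object response) {
            long delayMs = throttler.recordAndGetThrottleMs(clientId, requestTimeNanos);
            if (delayMs > 0)
                throttler.sendDelayed(response, delayMs);
            else
                channel.sendResponse(response);
        }
    }
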
On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com>
wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1
request handler unit, then it's as if I have a Kafka broker with a single
request handler thread dedicated to me. That's the most I can use, at
least. That allocation doesn't change even if an admin later increases the
size of the request thread pool on the broker. It's similar to the CPU
abstraction that VMs and containers get from hypervisors or OS schedulers.
While different client access patterns can use wildly different amounts of
request thread resources per request, a given application will generally
have a stable access pattern and can figure out empirically how many
"request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota
based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to
be excluded from throttling.

3. Implementation-wise, I am wondering if it's simpler to apply the
request time throttling first in KafkaApis.handle(). Otherwise, we will
need to add the throttling logic in each type of request.

Thanks,

Jun

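A worked illustration of point 1 (the numbers are made up): a percentage
quota silently scales with the size of the thread pool, while an absolute
unit quota does not.

    public final class PercentVsUnits {
        public static void main(String[] args) {
            double percentQuota = 10.0;  // 10% of total request handler time
            double unitQuota = 0.8;      // 0.8 units; 1 unit = 1 dedicated thread

            for (int threads : new int[] {8, 16}) {
                // capacity expressed in "thread-seconds per second"
                double percentCapacity = percentQuota / 100.0 * threads;
                System.out.printf(
                        "threads=%d  10%%-quota capacity=%.1f units  unit-quota capacity=%.1f units%n",
                        threads, percentCapacity, unitQuota);
            }
            // threads=8:  the 10% quota is 0.8 units; threads=16: it is 1.6
            // units, i.e. the user's absolute capacity doubled without any
            // change to the quota configuration.
        }
    }
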
On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request
handler utilization. At the moment, it uses percentage, but I am happy to
change to a fraction (out of 1 instead of 100) if required. I have added
the examples from this discussion to the KIP. Also added a "Future Work"
section to address network thread utilization. The configuration is named
"request_time_percent" with the expectation that it can also be used as
the limit for network thread utilization when that is implemented, so that
users have to set only one config for the two and not have to worry about
the internal distribution of the work between the two thread pools in
Kafka.

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is
exactly what people have said. I will just expand on that a bit. Consider
the following case. The producer sends a produce request with a 10MB
message but compressed to 100KB with gzip. The decompression of the
message on the broker could take 10-15 seconds, during which time a
request handler thread is completely blocked. In this case, neither the
byte-in quota nor the request rate quota may be effective in protecting
the broker. Consider another case. A consumer group starts with 10
instances and later on switches to 20 instances. The request rate will
likely double, but the actual load on the broker may not double since each
fetch request only contains half of the partitions. A request rate quota
may not be easy to configure in this case.

What we really want is to be able to prevent a client from using too much
of the server side resources. In this particular KIP, this resource is the
capacity of the request handler threads. I agree that it may not be
intuitive for the users to determine how to set the right limit. However,
this is not completely new and has been done in the container world
already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
has the concept of cpu.cfs_quota_us, which specifies the total amount of
time in microseconds for which all tasks in a cgroup can run during a one
second period. We can potentially model the request handler threads in a
similar way. For example, each request handler thread can be 1 request
handler unit and the admin can configure a limit on how many units (say
0.01) a client can have.

Regarding not throttling the internal broker to broker requests: we could
do that. Alternatively, we could just let the admin configure a high limit
for the kafka user (it may not be easy to do that based on clientId
though).

Ideally we want to be able to protect the utilization of the network
thread pool too. The difficulty is mostly what Rajini said: (1) The
mechanism for throttling the requests is through Purgatory and we will
have to think through how to integrate that into the network layer. (2) In
the network layer, currently we know the user, but not the clientId of the
request. So, it's a bit tricky to throttle based on clientId there. Plus,
the byteOut quota can already protect the network thread utilization for
fetch requests. So, if we can't figure out this part right now, just
focusing on the request handling threads for this KIP is still a useful
feature.

Thanks,

Jun

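A sketch of the cgroup-style accounting Jun describes: each request
handler thread contributes one "unit" of capacity, and a client's quota is
the number of units it may consume per period, analogous to
cpu.cfs_quota_us over cpu.cfs_period_us (names and structure are
assumptions for illustration):

    public final class HandlerUnitQuota {
        private final double quotaUnits;   // e.g. 0.01 units for a small client
        private final long periodNanos;    // accounting period, e.g. 1 second
        private long usedNanos;            // handler time consumed this period
        private long periodStartNanos;

        HandlerUnitQuota(double quotaUnits, long periodNanos) {
            this.quotaUnits = quotaUnits;
            this.periodNanos = periodNanos;
            this.periodStartNanos = System.nanoTime();
        }

        /** Records handler time for one request; true if the quota is now exceeded. */
        synchronized boolean record(long handlerTimeNanos) {
            long now = System.nanoTime();
            if (now - periodStartNanos >= periodNanos) { // new period: reset usage
                periodStartNanos = now;
                usedNanos = 0;
            }
            usedNanos += handlerTimeNanos;
            // quotaUnits * periodNanos is the thread-time budget per period;
            // 1.0 units equals one fully dedicated request handler thread,
            // independent of how many threads the pool actually has.
            return usedNanos > quotaUnits * periodNanos;
        }
    }
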
On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc; these are
throttled only if authorization fails (so they can't be used for DoS
attacks in a secure cluster, but inter-broker requests are allowed to
complete without delays).

I will wait another day to see if there is any objection to quotas based
on request processing time (as opposed to request rate) and if there are
no objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request
handler threads (that made calculation easy). I think the suggestion is to
include the time spent in the network threads as well since that may be
significant. As Jay pointed out, it is more complicated to calculate the
total available CPU time and convert to a ratio when there are *m* I/O
threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give
us what we want, but it can be very expensive on some platforms. As Becket
and Guozhang have pointed out, we do have several time measurements
already for generating metrics that we could use, though we might want to
switch to nanoTime() instead of currentTimeMillis() since some of the
values for small requests may be < 1ms. But rather than add up the time
spent in the I/O thread and the network thread, wouldn't it be better to
convert the time spent on each thread into a separate ratio? UserA has a
request quota of 5%. Can we take that to mean that UserA can use 5% of the
time on network threads and 5% of the time on I/O threads? If either is
exceeded, the response is throttled - it would mean maintaining two sets
of metrics for the two durations, but would result in more meaningful
ratios. We could define two quota limits (UserA has 5% of request threads
and 10% of network threads), but that seems unnecessary and harder to
explain to users.

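A sketch of the "separate ratio per thread pool" idea above: time spent on
network threads and on I/O threads is tracked independently (measured with
nanoTime(), since small requests can take < 1 ms) and each pool is checked
against the same quota ratio. Names are hypothetical.

    import java.util.concurrent.atomic.AtomicLong;

    public final class DualRatioQuota {
        private final double quotaRatio;  // e.g. 0.05 for a 5% request quota
        private final AtomicLong networkNanos = new AtomicLong();
        private final AtomicLong ioNanos = new AtomicLong();

        DualRatioQuota(double quotaRatio) { this.quotaRatio = quotaRatio; }

        void recordNetworkTime(long nanos) { networkNanos.addAndGet(nanos); }
        void recordIoTime(long nanos)      { ioNanos.addAndGet(nanos); }

        /**
         * @param windowNanos     length of the quota window
         * @param networkThreads  n network threads -> capacity n * windowNanos
         * @param ioThreads       m I/O threads     -> capacity m * windowNanos
         * @return true if the user exceeded its ratio on either pool
         */
        boolean violated(long windowNanos, int networkThreads, int ioThreads) {
            boolean networkExceeded =
                    networkNanos.get() > quotaRatio * networkThreads * windowNanos;
            boolean ioExceeded =
                    ioNanos.get() > quotaRatio * ioThreads * windowNanos;
            return networkExceeded || ioExceeded;
        }
    }
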
> > > > > > > >> > >> > >> > > > > > > > > > > > > Back to why and how quotas
> > are
> > > > > > applied
> > > > > > > >> to
> > > > > > > >> > >> > network
> > > > > > > >> > >> > >> > > thread
> > > > > > > >> > >> > >> > > > > > > > > utilization:
> > > > > > > >> > >> > >> > > > > > > > > > > > > a) In the case of fetch,
> > the
> > > > time
> > > > > > > >> spent in
> > > > > > > >> > >> the
> > > > > > > >> > >> > >> > network
> > > > > > > >> > >> > >> > > > > > thread
> > > > > > > >> > >> > >> > > > > > > > may
> > > > > > > >> > >> > >> > > > > > > > > be
> > > > > > > >> > >> > >> > > > > > > > > > > > > significant and I can see
> > the
> > > > need
> > > > > > to
> > > > > > > >> > include
> > > > > > > >> > >> > >> this.
> > > > > > > >> > >> > >> > Are
> > > > > > > >> > >> > >> > > > > there
> > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > >> > >> > >> > > > > > > > > > > > > requests where the network
> > > > thread
> > > > > > > >> > >> utilization is
> > > > > > > >> > >> > >> > > > > significant?
> > > > > > > >> > >> > >> > > > > > > In
> > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > >> > >> > >> > > > > > > > > > > case
> > > > > > > >> > >> > >> > > > > > > > > > > > > of fetch, request handler
> > > thread
> > > > > > > >> > utilization
> > > > > > > >> > >> > would
> > > > > > > >> > >> > >> > > > throttle
> > > > > > > >> > >> > >> > > > > > > > clients
> > > > > > > >> > >> > >> > > > > > > > > > > with
> > > > > > > >> > >> > >> > > > > > > > > > > > > high request rate, low
> data
> > > > volume
> > > > > > and
> > > > > > > >> > fetch
> > > > > > > >> > >> > byte
> > > > > > > >> > >> > >> > rate
> > > > > > > >> > >> > >> > > > > quota
> > > > > > > >> > >> > >> > > > > > > will
> > > > > > > >> > >> > >> > > > > > > > > > > > throttle
> > > > > > > >> > >> > >> > > > > > > > > > > > > clients with high data
> > volume.
> > > > > > Network
> > > > > > > >> > thread
> > > > > > > >> > >> > >> > > utilization
> > > > > > > >> > >> > >> > > > > is
> > > > > > > >> > >> > >> > > > > > > > > perhaps
> > > > > > > >> > >> > >> > > > > > > > > > > > > proportional to the data
> > > > volume. I
> > > > > > am
> > > > > > > >> > >> wondering
> > > > > > > >> > >> > >> if we
> > > > > > > >> > >> > >> > > > even
> > > > > > > >> > >> > >> > > > > > need
> > > > > > > >> > >> > >> > > > > > > > to
> > > > > > > >> > >> > >> > > > > > > > > > > > throttle
> > > > > > > >> > >> > >> > > > > > > > > > > > > based on network thread
> > > > > utilization
> > > > > > or
> > > > > > > >> > >> whether
> > > > > > > >> > >> > the
> > > > > > > >> > >> > >> > data
> > > > > > > >> > >> > >> > > > > > volume
> > > > > > > >> > >> > >> > > > > > > > > quota
> > > > > > > >> > >> > >> > > > > > > > > > > > covers
> > > > > > > >> > >> > >> > > > > > > > > > > > > this case.
> > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > b) At the moment, we
> record
> > > and
> > > > > > check
> > > > > > > >> for
> > > > > > > >> > >> quota
> > > > > > > >> > >> > >> > > violation
> > > > > > > >> > >> > >> > > > > at
> > > > > > > >> > >> > >> > > > > > > the
> > > > > > > >> > >> > >> > > > > > > > > same
> > > > > > > >> > >> > >> > > > > > > > > > > > time.
> > > > > > > >> > >> > >> > > > > > > > > > > > > If a quota is violated,
> the
> > > > > response
> > > > > > > is
> > > > > > > >> > >> delayed.
> > > > > > > >> > >> > >> > Using
> > > > > > > >> > >> > >> > > > > Jay'e
> > > > > > > >> > >> > >> > > > > > > > > example
> > > > > > > >> > >> > >> > > > > > > > > > of
> > > > > > > >> > >> > >> > > > > > > > > > > > > disk reads for fetches
> > > happening
> > > > > in
> > > > > > > the
> > > > > > > >> > >> network
> > > > > > > >> > >> > >> > thread,
> > > > > > > >> > >> > >> > > > We
> > > > > > > >> > >> > >> > > > > > > can't
> > > > > > > >> > >> > >> > > > > > > > > > record
> > > > > > > >> > >> > >> > > > > > > > > > > > and
> > > > > > > >> > >> > >> > > > > > > > > > > > > delay a response after the
> > > disk
> > > > > > reads.
> > > > > > > >> We
> > > > > > > >> > >> could
> > > > > > > >> > >> > >> > record
> > > > > > > >> > >> > >> > > > the
> > > > > > > >> > >> > >> > > > > > time
> > > > > > > >> > >> > >> > > > > > > > > spent
> > > > > > > >> > >> > >> > > > > > > > > > > on
> > > > > > > >> > >> > >> > > > > > > > > > > > > the network thread when
> the
> > > > > response
> > > > > > > is
> > > > > > > >> > >> complete
> > > > > > > >> > >> > >> and
> > > > > > > >> > >> > >> > > > > > introduce
> > > > > > > >> > >> > >> > > > > > > a
> > > > > > > >> > >> > >> > > > > > > > > > delay
> > > > > > > >> > >> > >> > > > > > > > > > > > for
> > > > > > > >> > >> > >> > > > > > > > > > > > > handling a subsequent
> > request
> > > > > > > (separate
> > > > > > > >> out
> > > > > > > >> > >> > >> recording
> > > > > > > >> > >> > >> > > and
> > > > > > > >> > >> > >> > > > > > quota
> > > > > > > >> > >> > >> > > > > > > > > > > violation
> > > > > > > >> > >> > >> > > > > > > > > > > > > handling in the case of
> > > network
> > > > > > thread
> > > > > > > >> > >> > overload).
> > > > > > > >> > >> > >> > Does
> > > > > > > >> > >> > >> > > > that
> > > > > > > >> > >> > >> > > > > > > make
> > > > > > > >> > >> > >> > > > > > > > > > sense?
> > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > Regards,
> > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > Rajini
> > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at
> 2:58
> > > AM,
> > > > > > > Becket
> > > > > > > >> > Qin <
> > > > > > > >> > >> > >> > > > > > > > becket.qin@gmail.com>
> > > > > > > >> > >> > >> > > > > > > > > > > > wrote:
> > > > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > Hey Jay,
> > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > Yeah, I agree that
> > enforcing
> > > > the
> > > > > > CPU
> > > > > > > >> time
> > > > > > > >> > >> is a
> > > > > > > >> > >> > >> > little
> > > > > > > >> > >> > >> > > > > > > tricky. I
> > > > > > > >> > >> > >> > > > > > > > > am
> > > > > > > >> > >> > >> > > > > > > > > > > > > thinking
> > > > > > > >> > >> > >> > > > > > > > > > > > > > that maybe we can use
> the
> > > > > existing
> > > > > > > >> > request
> > > > > > > >> > >> > >> > > statistics.
> > > > > > > >> > >> > >> > > > > They
> > > > > > > >> > >> > >> > > > > > > are
> > > > > > > >> > >> > >> > > > > > > > > > > already
> > > > > > > >> > >> > >> > > > > > > > > > > > > > very detailed so we can
> > > > probably
> > > > > > see
> > > > > > > >> the
> > > > > > > >> > >> > >> > approximate
> > > > > > > >> > >> > >> > > > CPU
> > > > > > > >> > >> > >> > > > > > time
> > > > > > > >> > >> > >> > > > > > > > > from
> > > > > > > >> > >> > >> > > > > > > > > > > it,
> > > > > > > >> > >> > >> > > > > > > > > > > > > e.g.
> > > > > > > >> > >> > >> > > > > > > > > > > > > > something like
> > (total_time -
> > > > > > > >> > >> > >> > > > request/response_queue_time
> > > > > > > >> > >> > >> > > > > -
> > > > > > > >> > >> > >> > > > > > > > > > > > remote_time).
> > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > I agree with Guozhang
> that
> > > > when
> > > > > a
> > > > > > > >> user is
> > > > > > > >> > >> > >> throttled
> > > > > > > >> > >> > >> > > it
> > > > > > > >> > >> > >> > > > is
> > > > > > > >> > >> > >> > > > > > > > likely
> > > > > > > >> > >> > >> > > > > > > > > > that
> > > > > > > >> > >> > >> > > > > > > > > > > > we
> > > > > > > >> > >> > >> > > > > > > > > > > > > > need to see if anything
> > has
> > > > went
> > > > > > > wrong
> > > > > > > >> > >> first,
> > > > > > > >> > >> > >> and
> > > > > > > >> > >> > >> > if
> > > > > > > >> > >> > >> > > > the
> > > > > > > >> > >> > >> > > > > > > users
> > > > > > > >> > >> > >> > > > > > > > > are
> > > > > > > >> > >> > >> > > > > > > > > > > well
> > > > > > > >> > >> > >> > > > > > > > > > > > > > behaving and just need
> > more
> > > > > > > >> resources, we
> > > > > > > >> > >> will
> > > > > > > >> > >> > >> have
> > > > > > > >> > >> > >> > > to
> > > > > > > >> > >> > >> > > > > bump
> > > > > > > >> > >> > >> > > > > > > up
> > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > >> > >> > >> > > > > > > > > > > > quota
> > > > > > > >> > >> > >> > > > > > > > > > > > > > for them. It is true
> that
> > > > > > > >> pre-allocating
> > > > > > > >> > >> CPU
> > > > > > > >> > >> > >> time
> > > > > > > >> > >> > >> > > quota
> > > > > > > >> > >> > >> > > > > > > > precisely
> > > > > > > >> > >> > >> > > > > > > > > > for
> > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > >> > >> > >> > > > > > > > > > > > > > users is difficult. So
> in
> > > > > practice
> > > > > > > it
> > > > > > > >> > would
> > > > > > > >> > >> > >> > probably
> > > > > > > >> > >> > >> > > be
> > > > > > > >> > >> > >> > > > > > more
> > > > > > > >> > >> > >> > > > > > > > like
> > > > > > > >> > >> > >> > > > > > > > > > > first
> > > > > > > >> > >> > >> > > > > > > > > > > > > set
> > > > > > > >> > >> > >> > > > > > > > > > > > > > a relative high
> protective
> > > CPU
> > > > > > time
> > > > > > > >> quota
> > > > > > > >> > >> for
> > > > > > > >> > >> > >> > > everyone
> > > > > > > >> > >> > >> > > > > and
> > > > > > > >> > >> > >> > > > > > > > > increase
> > > > > > > >> > >> > >> > > > > > > > > > > > that
> > > > > > > >> > >> > >> > > > > > > > > > > > > > for some individual
> > clients
> > > on
> > > > > > > demand.
> > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > Thanks,
> > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at
> > 5:48
> > > > PM,
> > > > > > > >> Guozhang
> > > > > > > >> > >> > Wang <
> > > > > > > >> > >> > >> > > > > > > > > wangguoz@gmail.com
> > > > > > > >> > >> > >> > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > wrote:
> > > > > > > >> > >> > >> > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > This is a great
> > proposal,
> > > > glad
> > > > > > to
> > > > > > > >> see
> > > > > > > >> > it
> > > > > > > >> > >> > >> > happening.
> > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > I am inclined to the
> CPU
> > > > > > > >> throttling, or
> > > > > > > >> > >> more
> > > > > > > >> > >> > >> > > > > specifically
> > > > > > > >> > >> > >> > > > > > > > > > > processing
> > > > > > > >> > >> > >> > > > > > > > > > > > > time
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > ratio instead of the
> > > request
> > > > > > rate
> > > > > > > >> > >> throttling
> > > > > > > >> > >> > >> as
> > > > > > > >> > >> > >> > > well.
> > > > > > > >> > >> > >> > > > > > > Becket
> > > > > > > >> > >> > >> > > > > > > > > has
> > > > > > > >> > >> > >> > > > > > > > > > > very
> > > > > > > >> > >> > >> > > > > > > > > > > > > > well
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > summed my rationales
> > > above,
> > > > > and
> > > > > > > one
> > > > > > > >> > >> thing to
> > > > > > > >> > >> > >> add
> > > > > > > >> > >> > >> > > here
> > > > > > > >> > >> > >> > > > > is
> > > > > > > >> > >> > >> > > > > > > that
> > > > > > > >> > >> > >> > > > > > > > > the
> > > > > > > >> > >> > >> > > > > > > > > > > > > former
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > has a good support for
> > > both
> > > > > > > >> "protecting
> > > > > > > >> > >> > >> against
> > > > > > > >> > >> > >> > > rogue
> > > > > > > >> > >> > >> > > > > > > > clients"
> > > > > > > >> > >> > >> > > > > > > > > as
> > > > > > > >> > >> > >> > > > > > > > > > > > well
> > > > > > > >> > >> > >> > > > > > > > > > > > > as
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > "utilizing a cluster
> for
> > > > > > > >> multi-tenancy
> > > > > > > >> > >> > usage":
> > > > > > > >> > >> > >> > when
> > > > > > > >> > >> > >> > > > > > > thinking
> > > > > > > >> > >> > >> > > > > > > > > > about
> > > > > > > >> > >> > >> > > > > > > > > > > > how
> > > > > > > >> > >> > >> > > > > > > > > > > > > to
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > explain this to the
> end
> > > > > users, I
> > > > > > > >> find
> > > > > > > >> > it
> > > > > > > >> > >> > >> actually
> > > > > > > >> > >> > >> > > > more
> > > > > > > >> > >> > >> > > > > > > > natural
> > > > > > > >> > >> > >> > > > > > > > > > than
> > > > > > > >> > >> > >> > > > > > > > > > > > the
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > request rate since as
> > > > > mentioned
> > > > > > > >> above,
> > > > > > > >> > >> > >> different
> > > > > > > >> > >> > >> > > > > requests
> > > > > > > >> > >> > >> > > > > > > > will
> > > > > > > >> > >> > >> > > > > > > > > > have
> > > > > > > >> > >> > >> > > > > > > > > > > > > quite
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > different "cost", and
> > > Kafka
> > > > > > today
> > > > > > > >> > already
> > > > > > > >> > >> > have
> > > > > > > >> > >> > >> > > > various
> > > > > > > >> > >> > >> > > > > > > > request
> > > > > > > >> > >> > >> > > > > > > > > > > types
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > (produce, fetch,
> admin,
> > > > > > metadata,
> > > > > > > >> etc),
> > > > > > > >> > >> > >> because
> > > > > > > >> > >> > >> > of
> > > > > > > >> > >> > >> > > > that
> > > > > > > >> > >> > >> > > > > > the
> > > > > > > >> > >> > >> > > > > > > > > > request
> > > > > > > >> > >> > >> > > > > > > > > > > > > rate
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > throttling may not be
> as
> > > > > > effective
> > > > > > > >> > >> unless it
> > > > > > > >> > >> > >> is
> > > > > > > >> > >> > >> > set
> > > > > > > >> > >> > >> > > > > very
> > > > > > > >> > >> > >> > > > > > > > > > > > > conservatively.
> > > > > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > Regarding to user
> > > reactions
> > > > > when
> > > > > > > >> they
> > > > > > > >> > are
> > > > > > > >> > >> > >> > > throttled,
> > > > > > > >> > >> > >> > > > I
> > > > > > > >> > >> > >> > > > > > > think
> > > > > > > >> > >> > >> > > > > > > > it
> > > > > > > >> > >> > >> > > > > > > > > > may
> > > > > > > >> > >> > >> > > > > > > > > > > > > > differ
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > case-by-case, and need
> > to
> > > be
> > > > > > > >> > discovered /
> > > > > > > >> > >> > >> guided
> > > > > > > >> > >> > >> > by
> > > > > > > >> > >> > >> > > > > > looking
> > > > > > > >> > >> > >> > > > > > > > at
> > > > > > > >> > >> > >> > > > > > > > > > > > relative
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > metrics. So in other
> > words
> > > > > users
> > > > > > > >> would
> > > > > > > >> > >> not
> > > > > > > >> > >> > >> expect
> > > > > > > >> > >> > >> > > to
> > > > > > > >> > >> > >> > > > > get
> > > > > > > >> > >> > >> > > > > > > > > > additional
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > information by simply
> > > being
> > > > > told
> > > > > > > >> "hey,
> > > > > > > >> > >> you
> > > > > > > >> > >> > are
> > > > > > > >> > >> > >> > > > > > throttled",
> > > > > > > >> > >> > >> > > > > > > > > which
> > > > > > > >> > >> > >> > > > > > > > > > is
> > > > > > > >> > >> > >> > > > > > > > > > > > all
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > what throttling does;
> > they
> > > > > need
> > > > > > to
> > > > > > > >> > take a
> > > > > > > >> > >> > >> > follow-up
> > > > > > > >> > >> > >> > > > > step
> > > > > > > >> > >> > >> > > > > > > and
> > > > > > > >> > >> > >> > > > > > > > > see
> > > > > > > >> > >> > >> > > > > > > > > > > > "hmm,
> > > > > > > >> > >> > >> > > > > > > > > > > > > > I'm
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > throttled probably
> > because
> > > > of
> > > > > > ..",
> > > > > > > >> > which
> > > > > > > >> > >> is
> > > > > > > >> > >> > by
> > > > > > > >> > >> > >> > > > looking
> > > > > > > >> > >> > >> > > > > at
> > > > > > > >> > >> > >> > > > > > > > other
> > > > > > > >> > >> > >> > > > > > > > > > > > metric
> > > > > > > >> > >> > >> > > > > > > > > > > > > > > values: e.g. whether
> I'm
> > > > > > > bombarding
> > > > > > > >> the
> > > > > > > >> > >> > >> brokers
> > > > > > > >> > >> > >> > > with
> > > > > > > >> > >> > >> > > > >
> > > > > > > >>
> > > > > > > > ...
> > > > > > > >
> > > > > > > > [Message clipped]
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
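
Becket's (total_time - request/response_queue_time - remote_time) suggestion
above is simple arithmetic over per-request statistics the broker already
keeps. A self-contained sketch of that computation; the field names below
mirror the metrics he mentions but are invented for illustration, not Kafka's
actual RequestMetrics API:

    // Approximate broker-side processing (CPU) time for one request, per the
    // suggestion above: subtract queueing and remote-wait time from the total.
    public final class RequestStats {
        final double totalTimeMs;
        final double requestQueueTimeMs;
        final double responseQueueTimeMs;
        final double remoteTimeMs; // e.g. waiting for follower acks

        RequestStats(double total, double requestQueue, double responseQueue, double remote) {
            this.totalTimeMs = total;
            this.requestQueueTimeMs = requestQueue;
            this.responseQueueTimeMs = responseQueue;
            this.remoteTimeMs = remote;
        }

        double approxCpuTimeMs() {
            return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs - remoteTimeMs;
        }
    }

Note that the subtraction still leaves in time spent blocked on disk I/O, so
it is an approximation of processing time rather than an exact CPU measure.
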

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Becket Qin <be...@gmail.com>.
Hi Rajini/Jun,

The percentage based reasoning sounds good.
One thing I am wondering is: if we assume the network threads are just
doing the network IO, can we say the byte rate quota is already sort of a
network threads quota?
If we take network threads into consideration here, would that be
somewhat overlapping with the byte rate quota?

Thanks,

Jiangjie (Becket) Qin

On Tue, Mar 7, 2017 at 11:04 AM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Jun,
>
> Thank you for the explanation, I hadn't realized you meant percentage of
> the total thread pool. If everyone is OK with Jun's suggestion, I will
> update the KIP.
>
> Thanks,
>
> Rajini
>
> On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Rajini,
> >
> > Let's take your example. Let's say a user sets the limit to 50%. I am
> > not sure if it's better to apply the same percentage separately to the
> > network and io thread pools. For example, for produce requests, most of
> > the time will be spent in the io threads whereas for fetch requests,
> > most of the time will be in the network threads. So, using the same
> > percentage in both thread pools means one of the pools' resources will
> > be over-allocated.
> >
> > An alternative way is to simply model the network and io thread pools
> > together. If you get 10 io threads and 5 network threads, you get 1500%
> > request processing power. A 50% limit means a total of 750% processing
> > power. We just add up the time a user request spent in either a network
> > or an io thread. If that total exceeds 750% (it doesn't matter whether
> > it's spent more in the network or the io threads), the request will be
> > throttled. This seems more general and is not sensitive to the current
> > implementation detail of having separate network and io thread pools.
> > In the future, if the threading model changes, the same concept of
> > quota can still be applied. For now, since it's a bit tricky to add the
> > delay logic in the network thread pool, we could probably just do the
> > delaying only in the io threads as you suggested earlier.
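
To make Jun's arithmetic concrete, here is a minimal, self-contained sketch
of the combined accounting he describes. It is illustrative only: the class
and method names are invented, and it assumes a simple fixed window rather
than Kafka's sampled quota windows:

    // One quota pool spanning all request-processing threads. With 10 io +
    // 5 network threads the capacity is 15 * 100% = 1500%; a 50% share of
    // that capacity is 750%, i.e. 7.5 thread-seconds per wall-clock second.
    public final class ThreadTimeQuota {
        private final double quotaPercent; // e.g. 750.0 means 750%
        private final long windowMs;       // measurement window, e.g. 1000 ms
        private double usedMs;             // thread time consumed in this window
        private long windowStartMs = System.currentTimeMillis();

        public ThreadTimeQuota(double quotaPercent, long windowMs) {
            this.quotaPercent = quotaPercent;
            this.windowMs = windowMs;
        }

        /** Record time spent for this user on ANY thread, network or io. */
        public synchronized void record(double threadTimeMs) {
            maybeRollWindow();
            usedMs += threadTimeMs;
        }

        /** Utilization as a percent of one CPU, the way `top` reports it. */
        public synchronized double usedPercent() {
            maybeRollWindow();
            long elapsed = Math.max(1, System.currentTimeMillis() - windowStartMs);
            return 100.0 * usedMs / elapsed;
        }

        public synchronized boolean violated() {
            return usedPercent() > quotaPercent;
        }

        private void maybeRollWindow() {
            long now = System.currentTimeMillis();
            if (now - windowStartMs >= windowMs) {
                usedMs = 0.0;
                windowStartMs = now;
            }
        }
    }

The point of the single pool shows up in record(): time measured on a network
thread and time measured on an io thread land in the same counter, so the
quota does not care how the broker happens to split its thread pools.
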
> >
> > There is still the orthogonal question of whether a quota of 50% is
> > out of 100% or 100% * #total processing threads. My feeling is that
> > the latter is slightly better, based on my explanation earlier. The
> > way to describe this quota to the users can be "share of elapsed
> > request processing time on a single CPU" (similar to top).
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <ra...@gmail.com>
> > wrote:
> >
> > > Jun,
> > >
> > > Agree about the two scenarios.
> > >
> > > But still not sure about a single quota covering both network threads
> > > and I/O threads with a per-thread quota. If there are 10 I/O threads
> > > and 5 network threads and I want to assign half the quota to userA,
> > > the quota would be 750%. I imagine, internally, we would convert this
> > > to 500% for I/O and 250% for network threads to allocate 50% of each
> > > pool.
> > >
> > > A couple of scenarios:
> > >
> > > 1. Admin adds 1 extra network thread. To retain 50%, admin needs to
> > > now allocate 800% for each user. Or increase the quota for a few
> > > users. To me, it feels like admin needs to convert 50% to 800% and
> > > Kafka internally needs to convert 800% to (500%, 300%). Everyone
> > > using just 50% feels a lot simpler. (The arithmetic is spelled out in
> > > the sketch below.)
> > >
> > > 2. We decide to add some other thread to this list. Admin needs to
> > > know exactly how many threads form the maximum quota. And we can be
> > > changing this between broker versions as we add more to the list.
> > > Again a single overall percent would be a lot simpler.
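
The conversions in Rajini's scenario 1 reduce to one line of arithmetic. A
tiny sketch, purely illustrative (the helper name is made up):

    public final class QuotaArithmetic {
        /** Per-user quota in "percent of one thread" units for a given share. */
        static double perUserPercent(int ioThreads, int networkThreads, double share) {
            return share * 100.0 * (ioThreads + networkThreads);
        }

        public static void main(String[] args) {
            // 10 io + 5 network threads: a 50% share is 750%.
            System.out.println(perUserPercent(10, 5, 0.5)); // 750.0
            // Add one network thread and the same 50% share becomes 800%,
            // so every existing per-thread-unit quota must be rewritten.
            System.out.println(perUserPercent(10, 6, 0.5)); // 800.0
        }
    }

This is exactly the trade-off under discussion: with n * 100% units the
configured numbers change whenever the thread count does, while with a
single 0-100% scale they stay put but newly added capacity is silently
shared out among existing users.
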
> > >
> > > There were others who were unconvinced by a single percent from the
> > > initial proposal and were happier with thread units similar to CPU
> > > units, so I am ok with going with per-thread quotas (as units or
> > > percent). Just not sure it makes it easier for admin in all cases.
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > Consider modeling as n * 100% unit. For 2), the question is what's
> > > > causing the I/O threads to be saturated. It's unlikely that all
> > > > users' utilization has increased at the same time. A more likely
> > > > case is that a few isolated users' utilization has increased. If
> > > > so, after increasing the number of threads, the admin just needs to
> > > > adjust the quota for a few isolated users, which is expected and is
> > > > less work.
> > > >
> > > > Consider modeling as 1 * 100% unit. For 1), all users' quotas need
> > > > to be adjusted, which is unexpected and is more work.
> > > >
> > > > So, to me, the n * 100% model seems more convenient.
> > > >
> > > > As for a future extension to cover network thread utilization, I
> > > > was thinking that one way is to simply model the capacity as
> > > > (n + m) * 100% unit, where n and m are the number of network and
> > > > i/o threads, respectively. Then, for each user, we can just add up
> > > > the utilization in the network and the i/o thread. If we do this,
> > > > we don't need a new type of quota.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <
> > rajinisivaram@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > If we use request.percentage as the percentage used in a single
> > > > > I/O thread, the total percentage being allocated will be
> > > > > num.io.threads * 100 for I/O threads and num.network.threads *
> > > > > 100 for network threads. A single quota covering the two as a
> > > > > percentage wouldn't quite work if you want to allocate the same
> > > > > proportion in both cases. If we want to treat threads as separate
> > > > > units, won't we need two quota configurations regardless of
> > > > > whether we use units or percentage? Perhaps I misunderstood your
> > > > > suggestion.
> > > > >
> > > > > I think there are two cases:
> > > > >
> > > > >    1. The use case that you mentioned where an admin is adding
> > > > >    more users and decides to add more I/O threads, and expects to
> > > > >    find free quota to allocate for new users.
> > > > >    2. Admin adds more I/O threads because the I/O threads are
> > > > >    saturated and there are cores available to allocate, even
> > > > >    though the number of users/clients hasn't changed.
> > > > >
> > > > > If we treated I/O threads as a single unit of 100%, all user
> > > > > quotas need to be reallocated for 1). If we treated I/O threads
> > > > > as n units with n*100%, all user quotas need to be reallocated
> > > > > for 2), otherwise some of the new threads may just not be used.
> > > > > Either way it should be easy to write a script to
> > > > > decrease/increase quotas by a multiple for all users.
> > > > >
> > > > > So it really boils down to which quota unit is most intuitive in
> > > > > terms of configuration. And from the discussion so far, it feels
> > > > > like opinion is divided on whether quotas should be carved out of
> > > > > an absolute 100% (or 1 unit) or be relative to the number of
> > > > > threads (n*100% or n units).
> > > > >
> > > > >
> > > > >
> > > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Another way to express an absolute limit is to use
> > > > > > request.percentage, but treat it as the percentage used in a
> > > > > > single request handling thread. For now, the request handling
> > > > > > threads can be just the io threads. In the future, they can
> > > > > > cover the network threads as well. This is similar to how top
> > > > > > reports CPU usage and may be a bit easier for people to
> > > > > > understand.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io>
> > wrote:
> > > > > >
> > > > > > > Hi, Jay,
> > > > > > >
> > > > > > > 2. Regarding request.unit vs request.percentage. I started
> > > > > > > with request.percentage too. The reasoning for request.unit
> > > > > > > is the following. Suppose that the capacity has been reached
> > > > > > > on a broker and the admin needs to add a new user. A simple
> > > > > > > way to increase the capacity is to increase the number of io
> > > > > > > threads, assuming there are still enough cores. If the limit
> > > > > > > is based on percentage, the additional capacity automatically
> > > > > > > gets distributed to existing users and we haven't really
> > > > > > > carved out any additional resource for the new user. Now, is
> > > > > > > it easy for a user to reason about 0.1 unit vs 10%? My
> > > > > > > feeling is that both are hard and have to be configured
> > > > > > > empirically. Not sure if percentage is obviously easier to
> > > > > > > reason about.
> > > > > > >
> > > > > > > Thanks,
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io>
> > > wrote:
> > > > > > >
> > > > > > >> A couple of quick points:
> > > > > > >>
> > > > > > >> 1. Even though the implementation of this quota is only
> > > > > > >> using io thread time, I think we should call it something
> > > > > > >> like "request-time". This will give us flexibility to
> > > > > > >> improve the implementation to cover network threads in the
> > > > > > >> future and will avoid exposing internal details like our
> > > > > > >> thread pools on the server.
> > > > > > >>
> > > > > > >> 2. Jun/Roger, I get what you are trying to fix but the idea
> > > > > > >> of thread/units is super unintuitive as a user-facing knob.
> > > > > > >> I had to read the KIP like eight times to understand this.
> > > > > > >> I'm not sure about your point that increasing the number of
> > > > > > >> threads is a problem with a percentage-based value; it
> > > > > > >> really depends on whether the user thinks about the
> > > > > > >> "percentage of request processing time" or "thread units".
> > > > > > >> If they think "I have allocated 10% of my request processing
> > > > > > >> time to user x" then it is a bug that increasing the thread
> > > > > > >> count decreases that percent, as it does in the current
> > > > > > >> proposal. As a practical matter I think the only way to
> > > > > > >> actually reason about this is as a percent; I just don't
> > > > > > >> believe people are going to think, "ah, 4.3 thread units,
> > > > > > >> that is the right amount!". Instead I think they have to
> > > > > > >> understand this thread unit concept, figure out what they
> > > > > > >> have set in number of threads, compute a percent and then
> > > > > > >> come up with the number of thread units, and these will all
> > > > > > >> be wrong if that thread count changes. I also think this
> > > > > > >> ties us to throttling the I/O thread pool, which may not be
> > > > > > >> where we want to end up.
> > > > > > >>
> > > > > > >> 3. For what it's worth I do think having a single
> > > > > > >> throttle_ms field in all the responses that combines all
> > > > > > >> throttling from all quotas is probably the simplest. There
> > > > > > >> could be a use case for having separate fields for each, but
> > > > > > >> I think that is actually harder to use/monitor in the common
> > > > > > >> case so unless someone has a use case I think just one
> > > > > > >> should be fine.
> > > > > > >>
> > > > > > >> -Jay
> > > > > > >>
> > > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > > > rajinisivaram@gmail.com>
> > > > > > >> wrote:
> > > > > > >>
> > > > > > >> > I have updated the KIP based on the discussions so far.
> > > > > > >> >
> > > > > > >> >
> > > > > > >> > Regards,
> > > > > > >> >
> > > > > > >> > Rajini
> > > > > > >> >
> > > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > > > >> rajinisivaram@gmail.com>
> > > > > > >> > wrote:
> > > > > > >> >
> > > > > > >> > > Thank you all for the feedback.
> > > > > > >> > >
> > > > > > >> > > Ismael #1. It makes sense not to throttle inter-broker
> > > > > > >> > > requests like LeaderAndIsr etc. The simplest way to
> > > > > > >> > > ensure that clients cannot use these requests to bypass
> > > > > > >> > > quotas for DoS attacks is to ensure that ACLs prevent
> > > > > > >> > > clients from using these requests and that unauthorized
> > > > > > >> > > requests are included towards quotas.
> > > > > > >> > >
> > > > > > >> > > Ismael #2, Jay #1: I was thinking that these quotas can
> > > > > > >> > > return a separate throttle time, and all utilization
> > > > > > >> > > based quotas could use the same field (we won't add
> > > > > > >> > > another one for network thread utilization, for
> > > > > > >> > > instance). But perhaps it makes sense to keep byte rate
> > > > > > >> > > quotas separate in produce/fetch responses to provide
> > > > > > >> > > separate metrics? Agree with Ismael that the name of the
> > > > > > >> > > existing field should be changed if we have two. Happy
> > > > > > >> > > to switch to a single combined throttle time if that is
> > > > > > >> > > sufficient.
> > > > > > >> > >
> > > > > > >> > > Ismael #4, #5, #6: Will update the KIP. Will use a dot
> > > > > > >> > > separated name for the new property. Replication quotas
> > > > > > >> > > use dot separated names, so it will be consistent with
> > > > > > >> > > all properties except byte rate quotas.
> > > > > > >> > >
> > > > > > >> > > Radai: #1 Request processing time rather than request
> > > > > > >> > > rate was chosen because the time per request can vary
> > > > > > >> > > significantly between requests, as mentioned in the
> > > > > > >> > > discussion and the KIP.
> > > > > > >> > > #2 Two separate quotas for heartbeats/regular requests
> > > > > > >> > > feel like more configuration and more metrics. Since
> > > > > > >> > > most users would set quotas higher than the expected
> > > > > > >> > > usage and quotas are more of a safety net, a single
> > > > > > >> > > quota should work in most cases.
> > > > > > >> > > #3 The number of requests in purgatory is limited by the
> > > > > > >> > > number of active connections, since only one request per
> > > > > > >> > > connection will be throttled at a time (see the sketch
> > > > > > >> > > below).
> > > > > > >> > > #4 As with byte rate quotas, to use the full allocated
> > > > > > >> > > quotas, clients/users would need to use partitions that
> > > > > > >> > > are distributed across the cluster. The alternative of
> > > > > > >> > > using cluster-wide quotas instead of per-broker quotas
> > > > > > >> > > would be far too complex to implement.
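
Rajini's #3 rests on Kafka handling at most one in-flight request per
connection: a delayed response parks that connection until the delay
expires, so purgatory occupancy is bounded by the connection count. A
minimal sketch of that invariant (the class and method names are invented
for illustration, not Kafka's):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // At most one entry per connection can wait here, because a connection
    // is not read from again until its previous response has been sent.
    public final class ThrottlePurgatory {
        private final Map<String, Long> delayedUntilMs = new ConcurrentHashMap<>();

        /** Record that connectionId's response is held back for delayMs. */
        public void delay(String connectionId, long delayMs) {
            // put(), not append-to-a-list: a second request cannot arrive
            // on this connection while the first response is still held.
            delayedUntilMs.put(connectionId, System.currentTimeMillis() + delayMs);
        }

        /** Upper bound on occupancy: one entry per active connection. */
        public int size() {
            return delayedUntilMs.size();
        }

        public boolean ready(String connectionId) {
            Long until = delayedUntilMs.get(connectionId);
            return until == null || System.currentTimeMillis() >= until;
        }
    }
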
> > > > > > >> > >
> > > > > > >> > > Dong: We currently have two ClientQuotaManagers for the
> > > > > > >> > > quota types Fetch and Produce. A new one will be added
> > > > > > >> > > for IOThread, which manages quotas for I/O thread
> > > > > > >> > > utilization. This will not update the Fetch or Produce
> > > > > > >> > > queue-size, but will have a separate metric for the
> > > > > > >> > > queue-size. I wasn't planning to add any additional
> > > > > > >> > > metrics apart from the equivalent ones for existing
> > > > > > >> > > quotas as part of this KIP. The ratio of byte-rate to
> > > > > > >> > > I/O thread utilization could be slightly misleading
> > > > > > >> > > since it depends on the sequence of requests. But we can
> > > > > > >> > > look into more metrics after the KIP is implemented if
> > > > > > >> > > required.
> > > > > > >> > >
> > > > > > >> > > I think we need to limit the maximum delay since all
> > > > > > >> > > requests are throttled. If a client has a quota of 0.001
> > > > > > >> > > units and a single request used 50ms, we don't want to
> > > > > > >> > > delay all requests from the client by 50 seconds,
> > > > > > >> > > throwing the client out of all its consumer groups. The
> > > > > > >> > > issue arises only if a user is allocated a quota that is
> > > > > > >> > > insufficient to process one large request. The
> > > > > > >> > > expectation is that the units allocated per user will be
> > > > > > >> > > much higher than the time taken to process one request,
> > > > > > >> > > so the limit should seldom be applied. Agree this needs
> > > > > > >> > > proper documentation.
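
The 50-second figure comes straight from delay = time_used / quota. A sketch
of that computation with the cap applied; it assumes, purely for
illustration, that the cap equals the quota window:

    public final class ThrottleDelay {
        /**
         * Delay that would amortize usedThreadMs down to quotaUnits (a
         * fraction of one thread, e.g. 0.001), capped at the window size.
         */
        static long delayMs(double usedThreadMs, double quotaUnits, long windowMs) {
            long uncapped = (long) (usedThreadMs / quotaUnits); // 50 / 0.001 = 50,000 ms
            return Math.min(uncapped, windowMs);
        }

        public static void main(String[] args) {
            System.out.println(delayMs(50.0, 0.001, 1000)); // 1000, not 50000
        }
    }

Without the cap, one 50 ms request under a 0.001 quota silences the client
for 50 seconds; with it, strict enforcement is traded away to keep the
client's session (and consumer group membership) alive, which is the
deviation Dong asks about below.
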
> > > > > > >> > >
> > > > > > >> > > Regards,
> > > > > > >> > >
> > > > > > >> > > Rajini
> > > > > > >> > >
> > > > > > >> > >
> > > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > > radai.rosenblatt@gmail.com>
> > > > > > >> > wrote:
> > > > > > >> > >
> > > > > > >> > >> @jun: i wasnt concerned about tying up a request
> processing
> > > > > thread,
> > > > > > >> but
> > > > > > >> > >> IIUC the code does still read the entire request out,
> which
> > > > might
> > > > > > >> add-up
> > > > > > >> > >> to
> > > > > > >> > >> a non-negligible amount of memory.
> > > > > > >> > >>
> > > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > > > lindong28@gmail.com>
> > > > > > >> wrote:
> > > > > > >> > >>
> > > > > > >> > >> > Hey Rajini,
> > > > > > >> > >> >
> > > > > > >> > >> > The current KIP says that the maximum delay will be
> > > > > > >> > >> > reduced to the window size if it is larger than the
> > > > > > >> > >> > window size. I have a concern with this:
> > > > > > >> > >> >
> > > > > > >> > >> > 1) This essentially means that the user is allowed to
> > > > > > >> > >> > exceed their quota over a long period of time. Can
> > > > > > >> > >> > you provide an upper bound on this deviation?
> > > > > > >> > >> >
> > > > > > >> > >> > 2) What is the motivation for capping the maximum
> > > > > > >> > >> > delay by the window size? I am wondering if there is
> > > > > > >> > >> > a better alternative to address the problem.
> > > > > > >> > >> >
> > > > > > >> > >> > 3) It means that the existing metric-related config
> > > > > > >> > >> > will have a more direct impact on the mechanism of
> > > > > > >> > >> > this io-thread-unit-based quota. That may be an
> > > > > > >> > >> > important change depending on the answer to 1) above.
> > > > > > >> > >> > We probably need to document this more explicitly.
> > > > > > >> > >> >
> > > > > > >> > >> > Dong
> > > > > > >> > >> >
> > > > > > >> > >> >
> > > > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > > > lindong28@gmail.com>
> > > > > > >> > wrote:
> > > > > > >> > >> >
> > > > > > >> > >> > > Hey Jun,
> > > > > > >> > >> > >
> > > > > > >> > >> > > Yeah you are right. I thought it wasn't because at
> > > > > > >> > >> > > LinkedIn it would be too much pressure on inGraph
> > > > > > >> > >> > > to expose those per-clientId metrics, so we ended
> > > > > > >> > >> > > up printing them periodically to a local log. Never
> > > > > > >> > >> > > mind if it is not a general problem.
> > > > > > >> > >> > >
> > > > > > >> > >> > > Hey Rajini,
> > > > > > >> > >> > >
> > > > > > >> > >> > > - I agree with Jay that we probably don't want to
> > > > > > >> > >> > > add a new field for every quota in ProduceResponse
> > > > > > >> > >> > > or FetchResponse. Is there any use-case for having
> > > > > > >> > >> > > separate throttle-time fields for the
> > > > > > >> > >> > > byte-rate-quota and the io-thread-unit-quota? You
> > > > > > >> > >> > > probably need to document this as an interface
> > > > > > >> > >> > > change if you plan to add a new field to any
> > > > > > >> > >> > > request.
> > > > > > >> > >> > >
> > > > > > >> > >> > > - I don't think IOThread belongs in quotaType. The
> > > > > > >> > >> > > existing quota types (i.e. Produce/Fetch/
> > > > > > >> > >> > > LeaderReplication/FollowerReplication) identify the
> > > > > > >> > >> > > type of request that is throttled, not the quota
> > > > > > >> > >> > > mechanism that is applied.
> > > > > > >> > >> > >
> > > > > > >> > >> > > - If a request is throttled due to this
> > > > > > >> > >> > > io-thread-unit-based quota, is the existing
> > > > > > >> > >> > > queue-size metric in ClientQuotaManager
> > > > > > >> > >> > > incremented?
> > > > > > >> > >> > >
> > > > > > >> > >> > > - In the interest of providing a guideline for the
> > > > > > >> > >> > > admin to decide the io-thread-unit-based quota, and
> > > > > > >> > >> > > for users to understand its impact on their
> > > > > > >> > >> > > traffic, would it be useful to have a metric that
> > > > > > >> > >> > > shows the overall byte-rate per io-thread-unit? Can
> > > > > > >> > >> > > we also show this as a per-clientId metric?
> > > > > > >> > >> > >
> > > > > > >> > >> > > Thanks,
> > > > > > >> > >> > > Dong
> > > > > > >> > >> > >
> > > > > > >> > >> > >
> > > > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <
> > > jun@confluent.io
> > > > >
> > > > > > >> wrote:
> > > > > > >> > >> > >
> > > > > > >> > >> > >> Hi, Ismael,
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> For #3, typically, an admin won't configure more
> > > > > > >> > >> > >> io threads than CPU cores, but it's possible for
> > > > > > >> > >> > >> an admin to start with fewer io threads than cores
> > > > > > >> > >> > >> and grow that later on.
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> Hi, Dong,
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> I think the throttleTime sensor on the broker
> > > > > > >> > >> > >> tells the admin whether a user/clientId is
> > > > > > >> > >> > >> throttled or not.
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> Hi, Radai,
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> The reasoning for delaying the throttled requests
> > > > > > >> > >> > >> on the broker instead of returning an error
> > > > > > >> > >> > >> immediately is that the latter has no way to
> > > > > > >> > >> > >> prevent the client from retrying immediately,
> > > > > > >> > >> > >> which will make things worse. The delaying logic
> > > > > > >> > >> > >> is based off a delay queue. A separate expiration
> > > > > > >> > >> > >> thread just waits on the next request to be
> > > > > > >> > >> > >> expired. So, it doesn't tie up a request handler
> > > > > > >> > >> > >> thread.
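
Jun's description maps almost one-to-one onto
java.util.concurrent.DelayQueue. A self-contained sketch of the pattern he
outlines; the class names are invented and this is not Kafka's actual
throttling code:

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // A throttled response waits in the queue until its delay expires.
    final class ThrottledResponse implements Delayed {
        final Runnable send;              // action that actually sends the response
        private final long expireAtNanos;

        ThrottledResponse(Runnable send, long delayMs) {
            this.send = send;
            this.expireAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(expireAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public final class ThrottledResponseExpirer {
        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        public void throttle(Runnable send, long delayMs) {
            queue.add(new ThrottledResponse(send, delayMs));
        }

        // One dedicated expiration thread: take() blocks until the head of
        // the queue expires, so no request handler thread is ever parked.
        public void start() {
            Thread t = new Thread(() -> {
                try {
                    while (true) queue.take().send.run();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            }, "throttled-response-expirer");
            t.setDaemon(true);
            t.start();
        }
    }
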
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> Thanks,
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> Jun
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <
> > > > > > ismael@juma.me.uk
> > > > > > >> >
> > > > > > >> > >> wrote:
> > > > > > >> > >> > >>
> > > > > > >> > >> > >> > Hi Jay,
> > > > > > >> > >> > >> >
> > > > > > >> > >> > >> > Regarding 1, I definitely like the simplicity
> > > > > > >> > >> > >> > of keeping a single throttle time field in the
> > > > > > >> > >> > >> > response. The downside is that the client
> > > > > > >> > >> > >> > metrics will be more coarse grained.
> > > > > > >> > >> > >> >
> > > > > > >> > >> > >> > Regarding 3, we have
> > > > > > >> > >> > >> > `leader.imbalance.per.broker.percentage` and
> > > > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > > > >> > >> > >> >
> > > > > > >> > >> > >> > Ismael
> > > > > > >> > >> > >> >
> > > > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <
> > > > > > jay@confluent.io>
> > > > > > >> > >> wrote:
> > > > > > >> > >> > >> >
> > > > > > >> > >> > >> > > A few minor comments:
> > > > > > >> > >> > >> > >
> > > > > > >> > >> > >> > >    1. Isn't it the case that the throttling
> > > > > > >> > >> > >> > >    time response field should have the total
> > > > > > >> > >> > >> > >    time your request was throttled,
> > > > > > >> > >> > >> > >    irrespective of the quotas that caused it?
> > > > > > >> > >> > >> > >    Limiting it to the byte rate quota doesn't
> > > > > > >> > >> > >> > >    make sense, but I also don't think we want
> > > > > > >> > >> > >> > >    to end up adding new fields in the response
> > > > > > >> > >> > >> > >    for every single thing we quota, right?
> > > > > > >> > >> > >> > >    2. I don't think we should make this quota
> > > > > > >> > >> > >> > >    specifically about io threads. Once we
> > > > > > >> > >> > >> > >    introduce these quotas people set them and
> > > > > > >> > >> > >> > >    expect them to be enforced (and if they
> > > > > > >> > >> > >> > >    aren't it may cause an outage). As a result
> > > > > > >> > >> > >> > >    they are a bit more sensitive than normal
> > > > > > >> > >> > >> > >    configs, I think. The current thread pools
> > > > > > >> > >> > >> > >    seem like something of an implementation
> > > > > > >> > >> > >> > >    detail and not the level the user-facing
> > > > > > >> > >> > >> > >    quotas should be involved with. I think it
> > > > > > >> > >> > >> > >    might be better to make this a general
> > > > > > >> > >> > >> > >    request-time throttle with no mention in
> > > > > > >> > >> > >> > >    the naming about I/O threads, and simply
> > > > > > >> > >> > >> > >    acknowledge the current limitation (which
> > > > > > >> > >> > >> > >    we may someday fix) in the docs: that this
> > > > > > >> > >> > >> > >    covers only the time after the request is
> > > > > > >> > >> > >> > >    read off the network.
> > > > > > >> > >> > >> > >    3. As such I think the right interface to
> > > > > > >> > >> > >> > >    the user would be something like
> > > > > > >> > >> > >> > >    percent_request_time, in {0,...,100}, or
> > > > > > >> > >> > >> > >    request_time_ratio, in {0.0,...,1.0} (I
> > > > > > >> > >> > >> > >    think "ratio" is the terminology we used if
> > > > > > >> > >> > >> > >    the scale is between 0 and 1 in the other
> > > > > > >> > >> > >> > >    metrics, right?)
> > > > > > >> > >> > >> > >
> > > > > > >> > >> > >> > > -Jay
> > > > > > >> > >> > >> > >
> > > > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram
> <
> > > > > > >> > >> > >> rajinisivaram@gmail.com
> > > > > > >> > >> > >> > >
> > > > > > >> > >> > >> > > wrote:
> > > > > > >> > >> > >> > >
> > > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > Thank you for the feedback.
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > Guozhang: I have updated the section on the
> > > > > > >> > >> > >> > > > co-existence of byte rate and request time
> > > > > > >> > >> > >> > > > quotas.
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > Dong: I hadn't added much detail to the
> > > > > > >> > >> > >> > > > metrics and sensors since they are going to
> > > > > > >> > >> > >> > > > be very similar to the existing metrics and
> > > > > > >> > >> > >> > > > sensors. To avoid confusion, I have now
> > > > > > >> > >> > >> > > > added more detail. All metrics are in the
> > > > > > >> > >> > >> > > > group "quotaType" and all sensors have names
> > > > > > >> > >> > >> > > > starting with "quotaType" (where quotaType
> > > > > > >> > >> > >> > > > is Produce/Fetch/LeaderReplication/
> > > > > > >> > >> > >> > > > FollowerReplication/*IOThread*). So there
> > > > > > >> > >> > >> > > > will be no reuse of existing
> > > > > > >> > >> > >> > > > metrics/sensors. The new ones for request
> > > > > > >> > >> > >> > > > processing time based throttling will be
> > > > > > >> > >> > >> > > > completely independent of the existing
> > > > > > >> > >> > >> > > > metrics/sensors, but will be consistent in
> > > > > > >> > >> > >> > > > format.
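
As an illustration of that naming convention, a sketch against the public
org.apache.kafka.common.metrics API; the specific sensor and metric names
below are guesses for illustration, not the KIP's final names:

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Avg;

    public final class IOThreadQuotaSensors {
        /** Sensor name starts with the quota type ("IOThread"), mirroring
         *  the existing Produce/Fetch quota sensors. */
        public static Sensor throttleTimeSensor(Metrics metrics, String user) {
            Sensor sensor = metrics.sensor("IOThread-throttle-time-" + user);
            sensor.add(metrics.metricName("throttle-time", "IOThread",
                    "Average throttle time in ms for user " + user), new Avg());
            return sensor;
        }
    }
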
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > The existing throttle_time_ms field in
> > > > > > >> > >> > >> > > > produce/fetch responses will not be impacted
> > > > > > >> > >> > >> > > > by this KIP. That will continue to return
> > > > > > >> > >> > >> > > > byte-rate based throttling times. In
> > > > > > >> > >> > >> > > > addition, a new field
> > > > > > >> > >> > >> > > > request_throttle_time_ms will be added to
> > > > > > >> > >> > >> > > > return request quota based throttling times.
> > > > > > >> > >> > >> > > > These will be exposed as new metrics on the
> > > > > > >> > >> > >> > > > client-side.
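
A sketch of how a client could hold the two values side by side; the class
and accessor names are hypothetical, not the actual protocol or client API:

    // Hypothetical holder: one field per throttling cause.
    public final class ThrottleTimes {
        public final long byteRateThrottleMs;    // existing throttle_time_ms
        public final long requestTimeThrottleMs; // proposed request_throttle_time_ms

        public ThrottleTimes(long byteRateThrottleMs, long requestTimeThrottleMs) {
            this.byteRateThrottleMs = byteRateThrottleMs;
            this.requestTimeThrottleMs = requestTimeThrottleMs;
        }

        /** One possible single combined value, as Jay suggests above;
         *  whether a combined field would sum or take the max of the
         *  delays is itself part of this discussion. */
        public long totalThrottleMs() {
            return byteRateThrottleMs + requestTimeThrottleMs;
        }
    }

Separate fields buy per-cause client metrics; a single combined field is
simpler but tells a throttled client nothing about which quota it hit.
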
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > Since all metrics and sensors are different
> > > > > > >> > >> > >> > > > for each type of quota, I believe there are
> > > > > > >> > >> > >> > > > already sufficient metrics to monitor
> > > > > > >> > >> > >> > > > throttling on both the client and broker
> > > > > > >> > >> > >> > > > side for each type of throttling.
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > Regards,
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > Rajini
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <
> > > > > > >> > lindong28@gmail.com
> > > > > > >> > >> >
> > > > > > >> > >> > >> wrote:
> > > > > > >> > >> > >> > > >
> > > > > > >> > >> > >> > > > > Hey Rajini,
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > I think it makes a lot of sense to use
> > > > > > >> > >> > >> > > > > io_thread_units as the metric to quota
> > > > > > >> > >> > >> > > > > users' traffic here. LGTM overall. I have
> > > > > > >> > >> > >> > > > > some questions regarding sensors.
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > - Can you be more specific in the KIP
> > > > > > >> > >> > >> > > > > about what sensors will be added? For
> > > > > > >> > >> > >> > > > > example, it will be useful to specify the
> > > > > > >> > >> > >> > > > > name and attributes of these new sensors.
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > - We currently have throttle-time and
> > > > > > >> > >> > >> > > > > queue-size for the byte-rate based quota.
> > > > > > >> > >> > >> > > > > Are you going to have a separate
> > > > > > >> > >> > >> > > > > throttle-time and queue-size for requests
> > > > > > >> > >> > >> > > > > throttled by the io_thread_unit-based
> > > > > > >> > >> > >> > > > > quota, or will they share the same sensor?
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > - Does the throttle-time in the
> > > > > > >> > >> > >> > > > > ProduceResponse and FetchResponse contain
> > > > > > >> > >> > >> > > > > the time due to the io_thread_unit-based
> > > > > > >> > >> > >> > > > > quota?
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > - Currently the kafka server doesn't
> > > > > > >> > >> > >> > > > > provide any log or metrics that tell
> > > > > > >> > >> > >> > > > > whether any given clientId (or user) is
> > > > > > >> > >> > >> > > > > throttled. This is not too bad because we
> > > > > > >> > >> > >> > > > > can still check the client-side byte-rate
> > > > > > >> > >> > >> > > > > metric to validate whether a given client
> > > > > > >> > >> > >> > > > > is throttled. But with this
> > > > > > >> > >> > >> > > > > io_thread_unit, there will be no way to
> > > > > > >> > >> > >> > > > > validate whether a given client is slow
> > > > > > >> > >> > >> > > > > because it has exceeded its
> > > > > > >> > >> > >> > > > > io_thread_unit limit. It is necessary for
> > > > > > >> > >> > >> > > > > users to be able to know this information
> > > > > > >> > >> > >> > > > > to figure out whether they have reached
> > > > > > >> > >> > >> > > > > their quota limit. How about we add a
> > > > > > >> > >> > >> > > > > log4j log on the server side to
> > > > > > >> > >> > >> > > > > periodically print the (client_id,
> > > > > > >> > >> > >> > > > > byte-rate-throttle-time,
> > > > > > >> > >> > >> > > > > io-thread-unit-throttle-time) so that the
> > > > > > >> > >> > >> > > > > kafka administrator can figure out those
> > > > > > >> > >> > >> > > > > users that have reached their limit and
> > > > > > >> > >> > >> > > > > act accordingly?
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > Thanks,
> > > > > > >> > >> > >> > > > > Dong
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang
> > Wang <
> > > > > > >> > >> > >> wangguoz@gmail.com>
> > > > > > >> > >> > >> > > > wrote:
> > > > > > >> > >> > >> > > > >
> > > > > > >> > >> > >> > > > > > Made a pass over the doc, overall LGTM
> > > > > > >> > >> > >> > > > > > except a minor comment on the throttling
> > > > > > >> > >> > >> > > > > > implementation:
> > > > > > >> > >> > >> > > > > >
> > > > > > >> > >> > >> > > > > > Stated as "Request processing time
> > > > > > >> > >> > >> > > > > > throttling will be applied on top if
> > > > > > >> > >> > >> > > > > > necessary." I thought that it meant the
> > > > > > >> > >> > >> > > > > > request processing time throttling is
> > > > > > >> > >> > >> > > > > > applied first, but on reading further I
> > > > > > >> > >> > >> > > > > > found it actually meant to apply produce
> > > > > > >> > >> > >> > > > > > / fetch byte rate throttling first.
> > > > > > >> > >> > >> > > > > >
> > > > > > >> > >> > >> > > > > > Also the last sentence, "The remaining
> > > > > > >> > >> > >> > > > > > delay if any is applied to the
> > > > > > >> > >> > >> > > > > > response.", is a bit confusing to me.
> > > > > > >> > >> > >> > > > > > Maybe reword it a bit?
> > > > > > >> > >> > >> > > > > >
> > > > > > >> > >> > >> > > > > >
> > > > > > >> > >> > >> > > > > > Guozhang
> > > > > > >> > >> > >> > > > > >
> > > > > > >> > >> > >> > > > > >
> > > > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <
> > > > > > >> > jun@confluent.io
> > > > > > >> > >> >
> > > > > > >> > >> > >> wrote:
> > > > > > >> > >> > >> > > > > >
> > > > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > > > >> > >> > >> > > > > > >
> > > > > > >> > >> > >> > > > > > > Thanks for the updated KIP. The
> > > > > > >> > >> > >> > > > > > > latest proposal looks good to me.
> > > > > > >> > >> > >> > > > > > >
> > > > > > >> > >> > >> > > > > > > Jun
> > > > > > >> > >> > >> > > > > > >
> > > > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini
> > > > Sivaram
> > > > > <
> > > > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > > > >> > >> > >> > > > > > >
> > > > > > >> > >> > >> > > > > > > wrote:
> > > > > > >> > >> > >> > > > > > >
> > > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > > Thank you for the feedback.
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > > 1. I have updated the KIP to use
> > > > > > >> > >> > >> > > > > > > > absolute units instead of
> > > > > > >> > >> > >> > > > > > > > percentage. The property is called
> > > > > > >> > >> > >> > > > > > > > *io_thread_units* to align with the
> > > > > > >> > >> > >> > > > > > > > thread count property
> > > > > > >> > >> > >> > > > > > > > *num.io.threads*. When we implement
> > > > > > >> > >> > >> > > > > > > > network thread utilization quotas,
> > > > > > >> > >> > >> > > > > > > > we can add another property
> > > > > > >> > >> > >> > > > > > > > *network_thread_units*.
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > > 2. ControlledShutdown is already
> > > > > > >> > >> > >> > > > > > > > listed under the exempt requests.
> > > > > > >> > >> > >> > > > > > > > Jun, did you mean a different
> > > > > > >> > >> > >> > > > > > > > request that needs to be added? The
> > > > > > >> > >> > >> > > > > > > > four requests currently exempt in
> > > > > > >> > >> > >> > > > > > > > the KIP are StopReplica,
> > > > > > >> > >> > >> > > > > > > > ControlledShutdown, LeaderAndIsr and
> > > > > > >> > >> > >> > > > > > > > UpdateMetadata. These are controlled
> > > > > > >> > >> > >> > > > > > > > using the ClusterAction ACL, so it
> > > > > > >> > >> > >> > > > > > > > is easy to exclude them and throttle
> > > > > > >> > >> > >> > > > > > > > only if unauthorized. I wasn't sure
> > > > > > >> > >> > >> > > > > > > > if there are other requests used
> > > > > > >> > >> > >> > > > > > > > only for inter-broker communication
> > > > > > >> > >> > >> > > > > > > > that needed to be excluded.
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > > 3. I was thinking the smallest change
> > > would
> > > > be
> > > > > > to
> > > > > > >> > >> replace
> > > > > > >> > >> > >> all
> > > > > > >> > >> > >> > > > > > references
> > > > > > >> > >> > >> > > > > > > to
> > > > > > >> > >> > >> > > > > > > > *requestChannel.sendResponse()* with
> a
> > > > local
> > > > > > >> method
> > > > > > >> > >> > >> > > > > > > > *sendResponseMaybeThrottle()* that
> does
> > > the
> > > > > > >> > throttling
> > > > > > >> > >> if
> > > > > > >> > >> > >> any
> > > > > > >> > >> > >> > > plus
> > > > > > >> > >> > >> > > > > send
> > > > > > >> > >> > >> > > > > > > > response. If we throttle first in
> > > > > > >> > *KafkaApis.handle()*,
> > > > > > >> > >> > the
> > > > > > >> > >> > >> > time
> > > > > > >> > >> > >> > > > > spent
> > > > > > >> > >> > >> > > > > > > > within the method handling the request
> > > will
> > > > > not
> > > > > > be
> > > > > > >> > >> > recorded
> > > > > > >> > >> > >> or
> > > > > > >> > >> > >> > > used
> > > > > > >> > >> > >> > > > > in
> > > > > > >> > >> > >> > > > > > > > throttling. We can look into this
> again
> > > when
> > > > > the
> > > > > > >> PR
> > > > > > >> > is
> > > > > > >> > >> > ready
> > > > > > >> > >> > >> > for
> > > > > > >> > >> > >> > > > > > review.
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > > Regards,
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > > Rajini
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > >
> > > > > > >> > >> > >> > > > > > > >
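A minimal sketch of the wrapper described in point 3, assuming a quota
manager that records handler time and returns a throttle delay. Everything
here other than sendResponseMaybeThrottle() and requestChannel.sendResponse()
is a hypothetical stand-in, not actual Kafka code:

    object ThrottleSketch {
      // Stand-ins for broker internals (hypothetical)
      final case class Request(clientId: String, dequeueTimeNanos: Long)
      final case class Response(request: Request)

      trait QuotaManager {
        // Records `nanos` of request handler time for `clientId` and
        // returns the delay in ms to apply (0 means no throttling)
        def recordAndMaybeThrottle(clientId: String, nanos: Long): Long
      }

      // Replaces each requestChannel.sendResponse(response) call site, so
      // the time spent inside the request handler is included in the quota
      def sendResponseMaybeThrottle(quotas: QuotaManager,
                                    send: Response => Unit,
                                    delaySend: (Response, Long) => Unit)
                                   (response: Response): Unit = {
        val handlerNanos = System.nanoTime() - response.request.dequeueTimeNanos
        val throttleMs =
          quotas.recordAndMaybeThrottle(response.request.clientId, handlerNanos)
        if (throttleMs > 0) delaySend(response, throttleMs) // e.g. via purgatory
        else send(response)
      }
    }
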
On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com>
wrote:

Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense. If my application is allocated 1
request handler unit, then it's as if I have a Kafka broker with a single
request handler thread dedicated to me. That's the most I can use, at
least. That allocation doesn't change even if an admin later increases the
size of the request thread pool on the broker. It's similar to the CPU
abstraction that VMs and containers get from hypervisors or OS schedulers.
While different client access patterns can use wildly different amounts of
request thread resources per request, a given application will generally
have a stable access pattern and can figure out empirically how many
"request thread units" it needs to meet its throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern of request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota
based on an absolute request thread unit is better.

2. ControlledShutdownRequest is also an inter-broker request and needs to
be excluded from throttling.

3. Implementation wise, I am wondering if it's simpler to apply the request
time throttling first in KafkaApis.handle(). Otherwise, we will need to add
the throttling logic in each type of request.

Thanks,

Jun

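To make comment 1 concrete, here is a small illustration (my numbers, not
the KIP's) of why a percentage limit is not absolute:

    // With request_time_percent = 10 and 8 request handler threads, a user
    // may consume 0.10 * 8 = 0.8 threads' worth of time; doubling the pool
    // to 16 threads silently doubles that to 1.6. An absolute unit limit
    // would stay fixed instead.
    def absoluteCapacityThreads(percentLimit: Double, handlerThreads: Int): Double =
      (percentLimit / 100.0) * handlerThreads

    // absoluteCapacityThreads(10, 8)  == 0.8
    // absoluteCapacityThreads(10, 16) == 1.6
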
On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler
utilization. At the moment, it uses percentage, but I am happy to change to
a fraction (out of 1 instead of 100) if required. I have added the examples
from this discussion to the KIP. Also added a "Future Work" section to
address network thread utilization. The configuration is named
"request_time_percent" with the expectation that it can also be used as the
limit for network thread utilization when that is implemented, so that
users have to set only one config for the two and not have to worry about
the internal distribution of the work between the two thread pools in
Kafka.

Regards,

Rajini

On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:

Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is
exactly what people have said. I will just expand that a bit. Consider the
following case. The producer sends a produce request with a 10MB message
but compressed to 100KB with gzip. The decompression of the message on the
broker could take 10-15 seconds, during which time a request handler
thread is completely blocked. In this case, neither the byte-in quota nor
the request rate quota may be effective in protecting the broker. Consider
another case. A consumer group starts with 10 instances and later on
switches to 20 instances. The request rate will likely double, but the
actual load on the broker may not double since each fetch request only
contains half of the partitions. Request rate quota may not be easy to
configure in this case.

What we really want is to be able to prevent a client from using too much
of the server side resources. In this particular KIP, this resource is the
capacity of the request handler threads. I agree that it may not be
intuitive for the users to determine how to set the right limit. However,
this is not completely new and has been done in the container world
already. For example, Linux cgroup
(https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
has the concept of cpu.cfs_quota_us, which specifies the total amount of
time in microseconds for which all tasks in a cgroup can run during a one
second period. We can potentially model the request handler threads in a
similar way. For example, each request handler thread can be 1 request
handler unit and the admin can configure a limit on how many units (say
0.01) a client can have.

Regarding not throttling the internal broker to broker requests: we could
do that. Alternatively, we could just let the admin configure a high limit
for the kafka user (it may not be able to do that easily based on clientId
though).

Ideally we want to be able to protect the utilization of the network thread
pool too. The difficulty is mostly what Rajini said: (1) The mechanism for
throttling the requests is through Purgatory and we will have to think
through how to integrate that into the network layer. (2) In the network
layer, currently we know the user, but not the clientId of the request.
So, it's a bit tricky to throttle based on clientId there. Plus, the
byteOut quota can already protect the network thread utilization for fetch
requests. So, if we can't figure out this part right now, just focusing on
the request handling threads for this KIP is still a useful feature.

Thanks,

Jun

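To make the cgroup analogy concrete, a small sketch (illustrative numbers,
not from the KIP) of how a per-window time budget would follow from a unit
quota:

    object HandlerUnitsSketch {
      // Mirrors the cgroup analogy: cfs_period_us ~ the quota window,
      // cfs_quota_us ~ the allowed thread time within it. Numbers are
      // illustrative only.
      val windowMs = 1000L     // one-second accounting window
      val clientUnits = 0.01   // client quota: 1/100th of one handler thread

      // Handler time the client may accumulate per window, independent of
      // how many handler threads the broker actually runs
      val allowedNanosPerWindow: Long = (clientUnits * windowMs * 1000000L).toLong

      // Delay that amortizes any overage back under the budget
      def throttleDelayMs(usedNanosInWindow: Long): Long = {
        val overNanos = usedNanosInWindow - allowedNanosPerWindow
        if (overNanos <= 0) 0L else overNanos / 1000000L
      }
    }
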
On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
wrote:

Thank you all for the feedback.

Jay: I have removed the exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc; these are
throttled only if authorization fails (so they can't be used for DoS
attacks in a secure cluster, but inter-broker requests complete without
delays).

I will wait another day to see if there is any objection to quotas based on
request processing time (as opposed to request rate) and if there are no
objections, I will revert to the original proposal with some changes.

The original proposal was only including the time used by the request
handler threads (that made calculation easy). I think the suggestion is to
include the time spent in the network threads as well since that may be
significant. As Jay pointed out, it is more complicated to calculate the
total available CPU time and convert to a ratio when there are *m* I/O
threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give
us what we want, but it can be very expensive on some platforms. As Becket
and Guozhang have pointed out, we do have several time measurements already
for generating metrics that we could use, though we might want to switch to
nanoTime() instead of currentTimeMillis() since some of the values for
small requests may be < 1ms. But rather than add up the time spent in the
I/O thread and the network thread, wouldn't it be better to convert the
time spent on each thread into a separate ratio? UserA has a request quota
of 5%. Can we take that to mean that UserA can use 5% of the time on
network threads and 5% of the time on I/O threads? If either is exceeded,
the response is throttled - it would mean maintaining two sets of metrics
for the two durations, but would result in more meaningful ratios. We could
define two quota limits (UserA has 5% of request threads and 10% of network
threads), but that seems unnecessary and harder to explain to users.

Back to why and how quotas are applied to network thread utilization:

a) In the case of fetch, the time spent in the network thread may be
significant and I can see the need to include this. Are there other
requests where the network thread utilization is significant? In the case
of fetch, request handler thread utilization would throttle clients with
high request rate and low data volume, and the fetch byte rate quota will
throttle clients with high data volume. Network thread utilization is
perhaps proportional to the data volume. I am wondering if we even need to
throttle based on network thread utilization or whether the data volume
quota covers this case.

b) At the moment, we record and check for quota violation at the same time.
If a quota is violated, the response is delayed. Using Jay's example of
disk reads for fetches happening in the network thread, we can't record and
delay a response after the disk reads. We could record the time spent on
the network thread when the response is complete and introduce a delay for
handling a subsequent request (separating out recording and quota violation
handling in the case of network thread overload). Does that make sense?

Regards,

Rajini

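A sketch of the record-now, delay-later idea in (b), assuming simple
per-client bookkeeping; the names and structure here are assumptions, not
the KIP's:

    object NetworkThreadQuotaSketch {
      import scala.collection.mutable

      // Delay owed per client, applied to its *next* request, since a
      // response whose network-thread time is only known after the send
      // cannot itself be delayed
      private val owedDelayMs =
        mutable.Map.empty[String, Long].withDefaultValue(0L)

      // After a response completes on the network thread: if recording its
      // time violated the quota, bank the computed delay
      def onResponseComplete(clientId: String, throttleMs: Long): Unit =
        if (throttleMs > 0) owedDelayMs(clientId) += throttleMs

      // Before handling the client's next request: consume the banked delay
      def takeDelayForNextRequest(clientId: String): Long = {
        val delay = owedDelayMs(clientId)
        owedDelayMs.remove(clientId)
        delay
      }
    }
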
On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:

Hey Jay,

Yeah, I agree that enforcing the CPU time is a little tricky. I am thinking
that maybe we can use the existing request statistics. They are already
very detailed so we can probably see the approximate CPU time from them,
e.g. something like (total_time - request/response_queue_time -
remote_time).

I agree with Guozhang that when a user is throttled it is likely that we
need to see if anything has gone wrong first, and if the users are well
behaved and just need more resources, we will have to bump up the quota for
them. It is true that pre-allocating CPU time quota precisely for the users
is difficult. So in practice it would probably be more like first setting a
relatively high protective CPU time quota for everyone and increasing it
for some individual clients on demand.

Thanks,

Jiangjie (Becket) Qin

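For illustration, the arithmetic Becket suggests, using the request-level
timings the broker already reports (parameter names here are descriptive,
not the actual metric names):

    // Approximate on-broker CPU time for a request: total time minus the
    // time spent waiting in queues and waiting on remote/purgatory work
    def approxCpuTimeMs(totalTimeMs: Double,
                        requestQueueTimeMs: Double,
                        responseQueueTimeMs: Double,
                        remoteTimeMs: Double): Double =
      totalTimeMs - requestQueueTimeMs - responseQueueTimeMs - remoteTimeMs

    // e.g. approxCpuTimeMs(120.0, 30.0, 10.0, 50.0) == 30.0
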
On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:

This is a great proposal, glad to see it happening.

I am inclined to the CPU throttling, or more specifically processing time
ratio, instead of the request rate throttling as well. Becket has summed up
my rationales very well above, and one thing to add here is that the former
has good support both for "protecting against rogue clients" and for
"utilizing a cluster for multi-tenancy usage": when thinking about how to
explain this to the end users, I find it actually more natural than the
request rate since, as mentioned above, different requests will have quite
different "cost", and Kafka today already has various request types
(produce, fetch, admin, metadata, etc); because of that, the request rate
throttling may not be as effective unless it is set very conservatively.

Regarding user reactions when they are throttled, I think it may differ
case-by-case, and needs to be discovered / guided by looking at related
metrics. So in other words users would not expect to get additional
information by simply being told "hey, you are throttled", which is all
that throttling does; they need to take a follow-up step and see "hmm, I'm
throttled probably because of ..", which is done by looking at other metric
values: e.g. whether I'm bombarding the brokers with

...

[Message clipped]


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Jun,

Thank you for the explanation, I hadn't realized you meant percentage of
the total thread pool. If everyone is OK with Jun's suggestion, I will
update the KIP.

Thanks,

Rajini

On Tue, Mar 7, 2017 at 5:08 PM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> Let's take your example. Let's say a user sets the limit to 50%. I am not
> sure if it's better to apply the same percentage separately to network and
> io thread pool. For example, for produce requests, most of the time will be
> spent in the io threads whereas for fetch requests, most of the time will
> be in the network threads. So, using the same percentage in both thread
> pools means one of the pools' resources will be over-allocated.
>
> An alternative way is to simply model network and io thread pool together.
> If you get 10 io threads and 5 network threads, you get 1500% request
> processing power. A 50% limit means a total of 750% processing power. We
> just add up the time a user request spent in either network or io thread.
> If that total exceeds 750% (doesn't matter whether it's spent more in
> network or io thread), the request will be throttled. This seems more
> general and is not sensitive to the current implementation detail of having
> a separate network and io thread pool. In the future, if the threading
> model changes, the same concept of quota can still be applied. For now,
> since it's a bit tricky to add the delay logic in the network thread pool,
> we could probably just do the delaying only in the io threads as you
> suggested earlier.
>
> There is still the orthogonal question of whether a quota of 50% is out of
> 100% or 100% * #total processing threads. My feeling is that the latter is
> slightly better based on my explanation earlier. The way to describe this
> quota to the users can be "share of elapsed request processing time on a
> single CPU" (similar to top).
>
> Thanks,
>
> Jun
>
>
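As a quick illustration of the combined-pool accounting Jun describes, a
sketch using his example numbers of 10 io and 5 network threads (the code
is mine, not from the KIP):

    object CombinedPoolQuotaSketch {
      // Total processing power: (io threads + network threads) * 100%
      def capacityPercent(ioThreads: Int, networkThreads: Int): Int =
        (ioThreads + networkThreads) * 100

      // A user's limit as a fraction of the total: with 10 io and 5
      // network threads, limitPercent(0.5, 10, 5) == 750.0
      def limitPercent(fraction: Double, ioThreads: Int, networkThreads: Int): Double =
        fraction * capacityPercent(ioThreads, networkThreads)

      // Throttle once the user's summed network + io utilization over the
      // quota window exceeds the limit, regardless of where it was spent
      def shouldThrottle(ioUtilPercent: Double, netUtilPercent: Double,
                         limit: Double): Boolean =
        ioUtilPercent + netUtilPercent > limit
    }
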
> On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Jun,
> >
> > Agree about the two scenarios.
> >
> > But still not sure about a single quota covering both network threads
> > and I/O threads with per-thread quota. If there are 10 I/O threads and
> > 5 network threads and I want to assign half the quota to userA, the
> > quota would be 750%. I imagine, internally, we would convert this to
> > 500% for I/O and 250% for network threads to allocate 50% of each pool.
> >
> > A couple of scenarios:
> >
> > 1. Admin adds 1 extra network thread. To retain 50%, admin needs to now
> > allocate 800% for each user. Or increase the quota for a few users. To
> > me, it feels like admin needs to convert 50% to 800% and Kafka
> > internally needs to convert 800% to (500%, 300%). Everyone using just
> > 50% feels a lot simpler.
> >
> > 2. We decide to add some other thread to this list. Admin needs to know
> > exactly how many threads form the maximum quota. And we can be changing
> > this between broker versions as we add more to the list. Again a single
> > overall percent would be a lot simpler.
> >
> > There were others who were unconvinced by a single percent from the
> > initial proposal and were happier with thread units similar to CPU
> > units, so I am ok with going with per-thread quotas (as units or
> > percent). Just not sure it makes it easier for admin in all cases.
> >
> > Regards,
> >
> > Rajini
> >
> > On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Consider modeling as n * 100% unit. For 2), the question is what's
> > > causing the I/O threads to be saturated. It's unlikely that all
> > > users' utilization has increased at the same time. A more likely case
> > > is that a few isolated users' utilization has increased. If so, after
> > > increasing the number of threads, the admin just needs to adjust the
> > > quota for a few isolated users, which is expected and is less work.
> > >
> > > Consider modeling as 1 * 100% unit. For 1), all users' quotas need to
> > > be adjusted, which is unexpected and is more work.
> > >
> > > So, to me, the n * 100% model seems more convenient.
> > >
> > > As for future extension to cover network thread utilization, I was
> > > thinking that one way is to simply model the capacity as (n + m) *
> > > 100% unit, where n and m are the number of network and i/o threads,
> > > respectively. Then, for each user, we can just add up the utilization
> > > in the network and the i/o threads. If we do this, we don't need a
> > > new type of quota.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram
> > > <rajinisivaram@gmail.com> wrote:
> > >
> > > > Jun,
> > > >
> > > > If we use request.percentage as the percentage used in a single I/O
> > > > thread, the total percentage being allocated will be num.io.threads
> > > > * 100 for I/O threads and num.network.threads * 100 for network
> > > > threads. A single quota covering the two as a percentage wouldn't
> > > > quite work if you want to allocate the same proportion in both
> > > > cases. If we want to treat threads as separate units, won't we need
> > > > two quota configurations regardless of whether we use units or
> > > > percentage? Perhaps I misunderstood your suggestion.
> > > >
> > > > I think there are two cases:
> > > >
> > > >    1. The use case that you mentioned where an admin is adding more
> > > >    users and decides to add more I/O threads and expects to find
> > > >    free quota to allocate for new users.
> > > >    2. Admin adds more I/O threads because the I/O threads are
> > > >    saturated and there are cores available to allocate, even though
> > > >    the number of users/clients hasn't changed.
> > > >
> > > > If we treated I/O threads as a single unit of 100%, all user quotas
> > > > need to be reallocated for 1). If we allocated I/O threads as n
> > > > units with n*100%, all user quotas need to be reallocated for 2),
> > > > otherwise some of the new threads may just not be used. Either way
> > > > it should be easy to write a script to decrease/increase quotas by
> > > > a multiple for all users.
> > > >
> > > > So it really boils down to which quota unit is most intuitive in
> > > > terms of configuration. And from the discussion so far, it feels
> > > > like opinion is divided on whether quotas should be carved out of
> > > > an absolute 100% (or 1 unit) or be relative to the number of
> > > > threads (n*100% or n units).
> > > >
> > > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Another way to express an absolute limit is to use
> > > > > request.percentage, but treat it as the percentage used in a
> > > > > single request handling thread. For now, the request handling
> > > > > threads can be just the io threads. In the future, they can cover
> > > > > the network threads as well. This is similar to how top reports
> > > > > CPU usage and may be a bit easier for people to understand.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io>
> wrote:
> > > > >
> > > > > > Hi, Jay,
> > > > > >
> > > > > > 2. Regarding request.unit vs request.percentage. I started with
> > > > > > request.percentage too. The reasoning for request.unit is the
> > > > following.
> > > > > > Suppose that the capacity has been reached on a broker and the
> > admin
> > > > > needs
> > > > > > to add a new user. A simple way to increase the capacity is to
> > > increase
> > > > > the
> > > > > > number of io threads, assuming there are still enough cores. If
> the
> > > > limit
> > > > > > is based on percentage, the additional capacity automatically
> gets
> > > > > > distributed to existing users and we haven't really carved out
> any
> > > > > > additional resource for the new user. Now, is it easy for a user
> to
> > > > > reason
> > > > > > about 0.1 unit vs 10%? My feeling is that both are hard and have
> to
> > > be
> > > > > > configured empirically. Not sure if percentage is obviously
> easier
> > to
> > > > > > reason about.
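> > > > > >
> > > > > > As a concrete comparison (my arithmetic, not from the KIP): with 8 io
> > > > > > threads, a 0.1-unit quota is 0.1 threads of capacity regardless of
> > > > > > pool size, while a 10% quota is 0.8 threads and silently becomes 1.6
> > > > > > threads if the admin doubles the pool to 16.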
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io>
> > wrote:
> > > > > >
> > > > > >> A couple of quick points:
> > > > > >>
> > > > > >> 1. Even though the implementation of this quota is only using io
> > > > thread
> > > > > >> time, I think we should call it something like "request-time".
> > This
> > > > will
> > > > > >> give us flexibility to improve the implementation to cover
> network
> > > > > threads
> > > > > >> in the future and will avoid exposing internal details like our
> > > thread
> > > > > >> pools on the server.
> > > > > >>
> > > > > >> 2. Jun/Roger, I get what you are trying to fix but the idea of
> > > > > >> thread/units
> > > > > >> is super unintuitive as a user-facing knob. I had to read the
> KIP
> > > like
> > > > > >> eight times to understand this. I'm not sure about your point
> that
> > > > > >> increasing the number of threads is a problem with a
> > > percentage-based
> > > > > >> value; it really depends on whether the user thinks about the
> > > > > "percentage
> > > > > >> of request processing time" or "thread units". If they think "I
> > have
> > > > > >> allocated 10% of my request processing time to user x" then it
> is
> > a
> > > > bug
> > > > > >> that increasing the thread count decreases that percent as it
> does
> > > in
> > > > > the
> > > > > >> current proposal. As a practical matter I think the only way to
> > > > actually
> > > > > >> reason about this is as a percent---I just don't believe people
> > are
> > > > > going
> > > > > >> to think, "ah, 4.3 thread units, that is the right amount!".
> > > Instead I
> > > > > >> think they have to understand this thread unit concept, figure
> out
> > > > what
> > > > > >> they have set in number of threads, compute a percent and then
> > come
> > > up
> > > > > >> with
> > > > > >> the number of thread units, and these will all be wrong if that
> > > thread
> > > > > >> count changes. I also think this ties us to throttling the I/O
> > > thread
> > > > > >> pool,
> > > > > >> which may not be where we want to end up.
> > > > > >>
> > > > > >> 3. For what it's worth I do think having a single throttle_ms
> > field
> > > in
> > > > > all
> > > > > >> the responses that combines all throttling from all quotas is
> > > probably
> > > > > the
> > > > > >> simplest. There could be a use case for having separate fields
> for
> > > > each,
> > > > > >> but I think that is actually harder to use/monitor in the common
> > > case
> > > > so
> > > > > >> unless someone has a use case I think just one should be fine.
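> > > > > >>
> > > > > >> For illustration, one plausible way to fold the quotas into that
> > > > > >> single field (a sketch, not the exact protocol change; the variable
> > > > > >> names are made up):
> > > > > >>
> > > > > >>   // Report one throttle time covering whichever quota delayed the
> > > > > >>   // request the longest.
> > > > > >>   int throttleTimeMs = Math.max(byteRateThrottleMs,
> > > > > >>                                 requestTimeThrottleMs);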
> > > > > >>
> > > > > >> -Jay
> > > > > >>
> > > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > > rajinisivaram@gmail.com>
> > > > > >> wrote:
> > > > > >>
> > > > > >> > I have updated the KIP based on the discussions so far.
> > > > > >> >
> > > > > >> >
> > > > > >> > Regards,
> > > > > >> >
> > > > > >> > Rajini
> > > > > >> >
> > > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > > >> rajinisivaram@gmail.com>
> > > > > >> > wrote:
> > > > > >> >
> > > > > >> > > Thank you all for the feedback.
> > > > > >> > >
> > > > > >> > > Ismael #1. It makes sense not to throttle inter-broker
> > requests
> > > > like
> > > > > >> > > LeaderAndIsr etc. The simplest way to ensure that clients
> > cannot
> > > > use
> > > > > >> > these
> > > > > >> > > requests to bypass quotas for DoS attacks is to ensure that
> > ACLs
> > > > > >> prevent
> > > > > >> > > clients from using these requests and unauthorized requests
> > are
> > > > > >> included
> > > > > >> > > towards quotas.
> > > > > >> > >
> > > > > >> > > Ismael #2, Jay #1 : I was thinking that these quotas can
> > return
> > > a
> > > > > >> > separate
> > > > > >> > > throttle time, and all utilization based quotas could use
> the
> > > same
> > > > > >> field
> > > > > >> > > (we won't add another one for network thread utilization for
> > > > > >> instance).
> > > > > >> > But
> > > > > >> > > perhaps it makes sense to keep byte rate quotas separate in
> > > > > >> produce/fetch
> > > > > >> > > responses to provide separate metrics? Agree with Ismael
> that
> > > the
> > > > > >> name of
> > > > > >> > > the existing field should be changed if we have two. Happy
> to
> > > > switch
> > > > > >> to a
> > > > > >> > > single combined throttle time if that is sufficient.
> > > > > >> > >
> > > > > >> > > Ismael #4, #5, #6: Will update KIP. Will use dot separated
> > name
> > > > for
> > > > > >> new
> > > > > >> > > property. Replication quotas use dot separated, so it will
> be
> > > > > >> consistent
> > > > > >> > > with all properties except byte rate quotas.
> > > > > >> > >
> > > > > >> > > Radai: #1 Request processing time rather than request rate
> > was
> > > > > chosen
> > > > > >> > > because the time per request can vary significantly between
> > > > requests
> > > > > >> as
> > > > > >> > > mentioned in the discussion and KIP.
> > > > > >> > > #2 Two separate quotas for heartbeats/regular requests feel
> > like
> > > > > more
> > > > > >> > > configuration and more metrics. Since most users would set
> > > quotas
> > > > > >> higher
> > > > > >> > > than the expected usage and quotas are more of a safety
> net, a
> > > > > single
> > > > > >> > quota
> > > > > >> > > should work in most cases.
> > > > > >> > >  #3 The number of requests in purgatory is limited by the
> > number
> > > > of
> > > > > >> > active
> > > > > >> > > connections since only one request per connection will be
> > > > throttled
> > > > > >> at a
> > > > > >> > > time.
> > > > > >> > > #4 As with byte rate quotas, to use the full allocated
> quotas,
> > > > > >> > > clients/users would need to use partitions that are
> > distributed
> > > > > across
> > > > > >> > the
> > > > > >> > > cluster. The alternative of using cluster-wide quotas
> instead
> > of
> > > > > >> > per-broker
> > > > > >> > > quotas would be far too complex to implement.
> > > > > >> > >
> > > > > >> > > Dong : We currently have two ClientQuotaManagers for quota
> > types
> > > > > Fetch
> > > > > >> > and
> > > > > >> > > Produce. A new one will be added for IOThread, which manages
> > > > quotas
> > > > > >> for
> > > > > >> > I/O
> > > > > >> > > thread utilization. This will not update the Fetch or
> Produce
> > > > > >> queue-size,
> > > > > >> > > but will have a separate metric for the queue-size.  I
> wasn't
> > > > > >> planning to
> > > > > >> > > add any additional metrics apart from the equivalent ones
> for
> > > > > existing
> > > > > >> > > quotas as part of this KIP. Ratio of byte-rate to I/O thread
> > > > > >> utilization
> > > > > >> > > could be slightly misleading since it depends on the
> sequence
> > of
> > > > > >> > requests.
> > > > > >> > > But we can look into more metrics after the KIP is
> implemented
> > > if
> > > > > >> > required.
> > > > > >> > >
> > > > > >> > > I think we need to limit the maximum delay since all
> requests
> > > are
> > > > > >> > > throttled. If a client has a quota of 0.001 units and a
> single
> > > > > request
> > > > > >> > used
> > > > > >> > > 50ms, we don't want to delay all requests from the client by
> > 50
> > > > > >> seconds,
> > > > > >> > > throwing the client out of all its consumer groups. The
> issue
> > is
> > > > > only
> > > > > >> if
> > > > > >> > a
> > > > > >> > > user is allocated a quota that is insufficient to process
> one
> > > > large
> > > > > >> > > request. The expectation is that the units allocated per
> user
> > > will
> > > > > be
> > > > > >> > much
> > > > > >> > > higher than the time taken to process one request and the
> > limit
> > > > > should
> > > > > >> > > seldom be applied. Agree this needs proper documentation.
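> > > > > >> > >
> > > > > >> > > As a rough sketch of that capping (numbers from the example above,
> > > > > >> > > names mine):
> > > > > >> > >
> > > > > >> > >   double usedMs = 50.0;      // thread time used by one request
> > > > > >> > >   double quotaUnits = 0.001; // allocated quota
> > > > > >> > >   double windowMs = 30000.0; // quota window size (illustrative)
> > > > > >> > >   // Naive delay usedMs / quotaUnits would be 50,000 ms; cap it.
> > > > > >> > >   double delayMs = Math.min(usedMs / quotaUnits, windowMs);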
> > > > > >> > >
> > > > > >> > > Regards,
> > > > > >> > >
> > > > > >> > > Rajini
> > > > > >> > >
> > > > > >> > >
> > > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > > radai.rosenblatt@gmail.com>
> > > > > >> > wrote:
> > > > > >> > >
> > > > > >> > >> @jun: I wasn't concerned about tying up a request processing
> > > > thread,
> > > > > >> but
> > > > > >> > >> IIUC the code does still read the entire request out, which
> > > might
> > > > > >> add up
> > > > > >> > >> to
> > > > > >> > >> a non-negligible amount of memory.
> > > > > >> > >>
> > > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > > lindong28@gmail.com>
> > > > > >> wrote:
> > > > > >> > >>
> > > > > >> > >> > Hey Rajini,
> > > > > >> > >> >
> > > > > >> > >> > The current KIP says that the maximum delay will be
> reduced
> > > to
> > > > > >> window
> > > > > >> > >> size
> > > > > >> > >> > if it is larger than the window size. I have a concern
> with
> > > > this:
> > > > > >> > >> >
> > > > > >> > >> > 1) This essentially means that the user is allowed to
> > exceed
> > > > > their
> > > > > >> > quota
> > > > > >> > >> > over a long period of time. Can you provide an upper
> bound
> > on
> > > > > this
> > > > > >> > >> > deviation?
> > > > > >> > >> >
> > > > > >> > >> > 2) What is the motivation for capping the maximum delay by
> the
> > > > window
> > > > > >> > size?
> > > > > >> > >> I
> > > > > >> > >> > am wondering if there is a better alternative to address
> the
> > > > > problem.
> > > > > >> > >> >
> > > > > >> > >> > 3) It means that the existing metric-related config will
> > > have a
> > > > > >> more
> > > > > >> > >> > direct impact on the mechanism of this
> > io-thread-unit-based
> > > > > >> quota.
> > > > > >> > This
> > > > > >> > >> > may be an important change depending on the answer to 1)
> > > above.
> > > > > We
> > > > > >> > >> probably
> > > > > >> > >> > need to document this more explicitly.
> > > > > >> > >> >
> > > > > >> > >> > Dong
> > > > > >> > >> >
> > > > > >> > >> >
> > > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > > lindong28@gmail.com>
> > > > > >> > wrote:
> > > > > >> > >> >
> > > > > >> > >> > > Hey Jun,
> > > > > >> > >> > >
> > > > > >> > >> > > Yeah you are right. I thought it wasn't because at
> > LinkedIn
> > > > it
> > > > > >> would
> > > > > >> > be
> > > > > >> > >> > too
> > > > > >> > >> > > much pressure on inGraph to expose those per-clientId
> > > metrics
> > > > > so
> > > > > >> we
> > > > > >> > >> ended
> > > > > >> > >> > > up printing them periodically to local log. Never mind
> if
> > > it
> > > > is
> > > > > >> not
> > > > > >> > a
> > > > > >> > >> > > general problem.
> > > > > >> > >> > >
> > > > > >> > >> > > Hey Rajini,
> > > > > >> > >> > >
> > > > > >> > >> > > - I agree with Jay that we probably don't want to add a
> > new
> > > > > field
> > > > > >> > for
> > > > > >> > >> > > every quota in ProduceResponse or FetchResponse. Is there
> > any
> > > > > >> use-case
> > > > > >> > >> for
> > > > > >> > >> > > having separate throttle-time fields for
> byte-rate-quota
> > > and
> > > > > >> > >> > > io-thread-unit-quota? You probably need to document
> this
> > as
> > > > > >> > interface
> > > > > >> > >> > > change if you plan to add new field in any request.
> > > > > >> > >> > >
> > > > > >> > >> > > - I don't think IOThread belongs to quotaType. The
> > existing
> > > > > quota
> > > > > >> > >> types
> > > > > >> > >> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> > > > > >> identify
> > > > > >> > >> the
> > > > > >> > >> > > type of request that is throttled, not the quota
> > mechanism
> > > > > that
> > > > > >> is
> > > > > >> > >> > applied.
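> > > > > >> > >> > >
> > > > > >> > >> > > A tiny illustration of the distinction (hypothetical names):
> > > > > >> > >> > >
> > > > > >> > >> > >   enum QuotaType {
> > > > > >> > >> > >       PRODUCE, FETCH, LEADER_REPLICATION, FOLLOWER_REPLICATION
> > > > > >> > >> > >   }
> > > > > >> > >> > >   // "IOThread" would describe how a quota is enforced, not
> > > > > >> > >> > >   // which request type it applies to, so it doesn't fit here.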
> > > > > >> > >> > >
> > > > > >> > >> > > - If a request is throttled due to this
> > > io-thread-unit-based
> > > > > >> quota,
> > > > > >> > is
> > > > > >> > >> > the
> > > > > >> > >> > > existing queue-size metric in ClientQuotaManager
> > > incremented?
> > > > > >> > >> > >
> > > > > >> > >> > > - In the interest of providing a guideline for admins to
> > > decide
> > > > > >> > >> > > io-thread-unit-based quota and for user to understand
> its
> > > > > impact
> > > > > >> on
> > > > > >> > >> their
> > > > > >> > >> > > traffic, would it be useful to have a metric that shows
> > the
> > > > > >> overall
> > > > > >> > >> > > byte-rate per io-thread-unit? Can we also show this as a
> > > > > >> per-clientId
> > > > > >> > >> > metric?
> > > > > >> > >> > >
> > > > > >> > >> > > Thanks,
> > > > > >> > >> > > Dong
> > > > > >> > >> > >
> > > > > >> > >> > >
> > > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <
> > jun@confluent.io
> > > >
> > > > > >> wrote:
> > > > > >> > >> > >
> > > > > >> > >> > >> Hi, Ismael,
> > > > > >> > >> > >>
> > > > > >> > >> > >> For #3, typically, an admin won't configure more io
> > > threads
> > > > > than
> > > > > >> > CPU
> > > > > >> > >> > >> cores,
> > > > > >> > >> > >> but it's possible for an admin to start with fewer io
> > > > threads
> > > > > >> than
> > > > > >> > >> cores
> > > > > >> > >> > >> and grow that later on.
> > > > > >> > >> > >>
> > > > > >> > >> > >> Hi, Dong,
> > > > > >> > >> > >>
> > > > > >> > >> > >> I think the throttleTime sensor on the broker tells
> the
> > > > admin
> > > > > >> > >> whether a
> > > > > >> > >> > >> user/clientId is throttled or not.
> > > > > >> > >> > >>
> > > > > >> > >> > >> Hi, Radai,
> > > > > >> > >> > >>
> > > > > >> > >> > >> The reasoning for delaying the throttled requests on
> the
> > > > > broker
> > > > > >> > >> instead
> > > > > >> > >> > of
> > > > > >> > >> > >> returning an error immediately is that the latter has
> no
> > > way
> > > > > to
> > > > > >> > >> prevent
> > > > > >> > >> > >> the
> > > > > >> > >> > >> client from retrying immediately, which will make
> things
> > > > > worse.
> > > > > >> The
> > > > > >> > >> > >> delaying logic is based off a delay queue. A separate
> > > > > expiration
> > > > > >> > >> thread
> > > > > >> > >> > >> just waits on the next to be expired request. So, it
> > > doesn't
> > > > > tie
> > > > > >> > up a
> > > > > >> > >> > >> request handler thread.
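> > > > > >> > >> > >>
> > > > > >> > >> > >> A minimal sketch of that pattern with the JDK's DelayQueue (not
> > > > > >> > >> > >> the actual broker code; imports from java.util.concurrent):
> > > > > >> > >> > >>
> > > > > >> > >> > >>   class DelayedResponse implements Delayed {
> > > > > >> > >> > >>       final long sendAtNanos;
> > > > > >> > >> > >>       final Runnable send;
> > > > > >> > >> > >>       DelayedResponse(long delayMs, Runnable send) {
> > > > > >> > >> > >>           this.sendAtNanos = System.nanoTime() + delayMs * 1000000;
> > > > > >> > >> > >>           this.send = send;
> > > > > >> > >> > >>       }
> > > > > >> > >> > >>       public long getDelay(TimeUnit unit) {
> > > > > >> > >> > >>           return unit.convert(sendAtNanos - System.nanoTime(),
> > > > > >> > >> > >>                               TimeUnit.NANOSECONDS);
> > > > > >> > >> > >>       }
> > > > > >> > >> > >>       public int compareTo(Delayed other) {
> > > > > >> > >> > >>           return Long.compare(getDelay(TimeUnit.NANOSECONDS),
> > > > > >> > >> > >>                               other.getDelay(TimeUnit.NANOSECONDS));
> > > > > >> > >> > >>       }
> > > > > >> > >> > >>   }
> > > > > >> > >> > >>
> > > > > >> > >> > >>   DelayQueue<DelayedResponse> queue = new DelayQueue<>();
> > > > > >> > >> > >>   // One expiration thread blocks on the earliest expiry; request
> > > > > >> > >> > >>   // handler threads only enqueue and move on.
> > > > > >> > >> > >>   new Thread(() -> {
> > > > > >> > >> > >>       try { while (true) queue.take().send.run(); }
> > > > > >> > >> > >>       catch (InterruptedException e) { Thread.currentThread().interrupt(); }
> > > > > >> > >> > >>   }).start();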
> > > > > >> > >> > >>
> > > > > >> > >> > >> Thanks,
> > > > > >> > >> > >>
> > > > > >> > >> > >> Jun
> > > > > >> > >> > >>
> > > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <
> > > > > ismael@juma.me.uk
> > > > > >> >
> > > > > >> > >> wrote:
> > > > > >> > >> > >>
> > > > > >> > >> > >> > Hi Jay,
> > > > > >> > >> > >> >
> > > > > >> > >> > >> > Regarding 1, I definitely like the simplicity of
> > > keeping a
> > > > > >> single
> > > > > >> > >> > >> throttle
> > > > > >> > >> > >> > time field in the response. The downside is that the
> > > > client
> > > > > >> > metrics
> > > > > >> > >> > >> will be
> > > > > >> > >> > >> > more coarse grained.
> > > > > >> > >> > >> >
> > > > > >> > >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> > > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > > >> > >> > >> >
> > > > > >> > >> > >> > Ismael
> > > > > >> > >> > >> >
> > > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <
> > > > > jay@confluent.io>
> > > > > >> > >> wrote:
> > > > > >> > >> > >> >
> > > > > >> > >> > >> > > A few minor comments:
> > > > > >> > >> > >> > >
> > > > > >> > >> > >> > >    1. Isn't it the case that the throttling time
> > > > response
> > > > > >> field
> > > > > >> > >> > should
> > > > > >> > >> > >> > have
> > > > > >> > >> > >> > >    the total time your request was throttled
> > > > irrespective
> > > > > of
> > > > > >> > the
> > > > > >> > >> > >> quotas
> > > > > >> > >> > >> > > that
> > > > > >> > >> > >> > >    caused that? Limiting it to the byte rate quota
> > doesn't
> > > > > make
> > > > > >> > >> sense,
> > > > > >> > >> > >> but I
> > > > > >> > >> > >> > > also
> > > > > >> > >> > >> > >    don't think we want to end up adding new
> fields
> > > in
> > > > > the
> > > > > >> > >> response
> > > > > >> > >> > >> for
> > > > > >> > >> > >> > > every
> > > > > >> > >> > >> > >    single thing we quota, right?
> > > > > >> > >> > >> > >    2. I don't think we should make this quota
> > > > specifically
> > > > > >> > about
> > > > > >> > >> io
> > > > > >> > >> > >> > >    threads. Once we introduce these quotas, people
> > set
> > > > them
> > > > > >> and
> > > > > >> > >> > expect
> > > > > >> > >> > >> > them
> > > > > >> > >> > >> > > to
> > > > > >> > >> > >> > >    be enforced (and if they aren't it may cause an
> > > > > outage).
> > > > > >> As
> > > > > >> > a
> > > > > >> > >> > >> result
> > > > > >> > >> > >> > > they
> > > > > >> > >> > >> > >    are a bit more sensitive than normal configs, I
> > > > think.
> > > > > >> The
> > > > > >> > >> > current
> > > > > >> > >> > >> > > thread
> > > > > >> > >> > >> > >    pools seem like something of an implementation
> > > detail
> > > > > and
> > > > > >> > not
> > > > > >> > >> the
> > > > > >> > >> > >> > level
> > > > > >> > >> > >> > > the
> > > > > >> > >> > >> > >    user-facing quotas should be involved with. I
> > think
> > > > it
> > > > > >> might
> > > > > >> > >> be
> > > > > >> > >> > >> better
> > > > > >> > >> > >> > > to
> > > > > >> > >> > >> > >    make this a general request-time throttle with
> no
> > > > > >> mention in
> > > > > >> > >> the
> > > > > >> > >> > >> > naming
> > > > > >> > >> > >> > >    about I/O threads and simply acknowledge the
> > > current
> > > > > >> > >> limitation
> > > > > >> > >> > >> (which
> > > > > >> > >> > >> > > we
> > > > > >> > >> > >> > >    may someday fix) in the docs that this covers
> > only
> > > > the
> > > > > >> time
> > > > > >> > >> after
> > > > > >> > >> > >> the
> > > > > >> > >> > >> > >    thread is read off the network.
> > > > > >> > >> > >> > >    3. As such I think the right interface to the
> > user
> > > > > would
> > > > > >> be
> > > > > >> > >> > >> something
> > > > > >> > >> > >> > >    like percent_request_time and be in {0,...100}
> or
> > > > > >> > >> > >> request_time_ratio
> > > > > >> > >> > >> > > and be
> > > > > >> > >> > >> > >    in {0.0,...,1.0} (I think "ratio" is the
> > > terminology
> > > > we
> > > > > >> used
> > > > > >> > >> if
> > > > > >> > >> > the
> > > > > >> > >> > >> > > scale
> > > > > >> > >> > >> > >    is between 0 and 1 in the other metrics,
> right?)
> > > > > >> > >> > >> > >
> > > > > >> > >> > >> > > -Jay
> > > > > >> > >> > >> > >
> > > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
> > > > > >> > >> > >> rajinisivaram@gmail.com
> > > > > >> > >> > >> > >
> > > > > >> > >> > >> > > wrote:
> > > > > >> > >> > >> > >
> > > > > >> > >> > >> > > > Guozhang/Dong,
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > Thank you for the feedback.
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > Guozhang : I have updated the section on
> > > co-existence
> > > > of
> > > > > >> byte
> > > > > >> > >> rate
> > > > > >> > >> > >> and
> > > > > >> > >> > >> > > > request time quotas.
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > Dong: I hadn't added much detail to the metrics
> > and
> > > > > >> sensors
> > > > > >> > >> since
> > > > > >> > >> > >> they
> > > > > >> > >> > >> > > are
> > > > > >> > >> > >> > > > going to be very similar to the existing metrics
> > and
> > > > > >> sensors.
> > > > > >> > >> To
> > > > > >> > >> > >> avoid
> > > > > >> > >> > >> > > > confusion, I have now added more detail. All
> > metrics
> > > > are
> > > > > >> in
> > > > > >> > the
> > > > > >> > >> > >> group
> > > > > >> > >> > >> > > > "quotaType" and all sensors have names starting
> > with
> > > > > >> > >> "quotaType"
> > > > > >> > >> > >> (where
> > > > > >> > >> > >> > > > quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*).
> > > > > >> > >> > >> > > > So there will be no reuse of existing
> > > metrics/sensors.
> > > > > The
> > > > > >> > new
> > > > > >> > >> > ones
> > > > > >> > >> > >> for
> > > > > >> > >> > >> > > > request processing time based throttling will be
> > > > > >> completely
> > > > > >> > >> > >> independent
> > > > > >> > >> > >> > > of
> > > > > >> > >> > >> > > > existing metrics/sensors, but will be consistent
> > in
> > > > > >> format.
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > The existing throttle_time_ms field in
> > produce/fetch
> > > > > >> > responses
> > > > > >> > >> > will
> > > > > >> > >> > >> not
> > > > > >> > >> > >> > > be
> > > > > >> > >> > >> > > > impacted by this KIP. That will continue to
> return
> > > > > >> byte-rate
> > > > > >> > >> based
> > > > > >> > >> > >> > > > throttling times. In addition, a new field
> > > > > >> > >> > request_throttle_time_ms
> > > > > >> > >> > >> > will
> > > > > >> > >> > >> > > be
> > > > > >> > >> > >> > > > added to return request quota based throttling
> > > times.
> > > > > >> These
> > > > > >> > >> will
> > > > > >> > >> > be
> > > > > >> > >> > >> > > exposed
> > > > > >> > >> > >> > > > as new metrics on the client-side.
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > Since all metrics and sensors are different for
> > each
> > > > > type
> > > > > >> of
> > > > > >> > >> > quota,
> > > > > >> > >> > >> I
> > > > > >> > >> > >> > > > believe there are already sufficient metrics to
> > > monitor
> > > > > >> > >> throttling
> > > > > >> > >> > on
> > > > > >> > >> > >> > both
> > > > > >> > >> > >> > > > client and broker side for each type of
> > throttling.
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > Regards,
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > Rajini
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <
> > > > > >> > lindong28@gmail.com
> > > > > >> > >> >
> > > > > >> > >> > >> wrote:
> > > > > >> > >> > >> > > >
> > > > > >> > >> > >> > > > > Hey Rajini,
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > I think it makes a lot of sense to use
> > > > io_thread_units
> > > > > >> as
> > > > > >> > >> metric
> > > > > >> > >> > >> to
> > > > > >> > >> > >> > > quota
> > > > > >> > >> > >> > > > > users' traffic here. LGTM overall. I have some
> > > > > questions
> > > > > >> > >> > regarding
> > > > > >> > >> > >> > > > sensors.
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > - Can you be more specific in the KIP what
> > sensors
> > > > > will
> > > > > >> be
> > > > > >> > >> > added?
> > > > > >> > >> > >> For
> > > > > >> > >> > >> > > > > example, it will be useful to specify the name
> > and
> > > > > >> > >> attributes of
> > > > > >> > >> > >> > these
> > > > > >> > >> > >> > > > new
> > > > > >> > >> > >> > > > > sensors.
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > - We currently have throttle-time and
> queue-size
> > > for
> > > > > >> > >> byte-rate
> > > > > >> > >> > >> based
> > > > > >> > >> > >> > > > quota.
> > > > > >> > >> > >> > > > > Are you going to have separate throttle-time
> and
> > > > > >> queue-size
> > > > > >> > >> for
> > > > > >> > >> > >> > > requests
> > > > > >> > >> > >> > > > > throttled by io_thread_unit-based quota, or
> will
> > > > they
> > > > > >> share
> > > > > >> > >> the
> > > > > >> > >> > >> same
> > > > > >> > >> > >> > > > > sensor?
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > - Does the throttle-time in the
> ProduceResponse
> > > and
> > > > > >> > >> > FetchResponse
> > > > > >> > >> > >> > > > contains
> > > > > >> > >> > >> > > > > time due to io_thread_unit-based quota?
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > - Currently kafka server doesn't provide
> any
> > > log
> > > > > or
> > > > > >> > >> metrics
> > > > > >> > >> > >> that
> > > > > >> > >> > >> > > > tells
> > > > > >> > >> > >> > > > > whether any given clientId (or user) is
> > throttled.
> > > > > This
> > > > > >> is
> > > > > >> > >> not
> > > > > >> > >> > too
> > > > > >> > >> > >> > bad
> > > > > >> > >> > >> > > > > because we can still check the client-side
> > > byte-rate
> > > > > >> metric
> > > > > >> > >> to
> > > > > >> > >> > >> > validate
> > > > > >> > >> > >> > > > > whether a given client is throttled. But with
> > this
> > > > > >> > >> > io_thread_unit,
> > > > > >> > >> > >> > > there
> > > > > >> > >> > >> > > > > will be no way to validate whether a given
> > client
> > > is
> > > > > >> slow
> > > > > >> > >> > because
> > > > > >> > >> > >> it
> > > > > >> > >> > >> > > has
> > > > > >> > >> > >> > > > > exceeded its io_thread_unit limit. It is
> > necessary
> > > > for
> > > > > >> user
> > > > > >> > >> to
> > > > > >> > >> > be
> > > > > >> > >> > >> > able
> > > > > >> > >> > >> > > to
> > > > > >> > >> > >> > > > > know this information to figure out whether
> they
> > > > have
> > > > > >> > reached
> > > > > >> > >> > >> their
> > > > > >> > >> > >> > > quota
> > > > > >> > >> > >> > > > > limit. How about we add a log4j log on the
> server
> > > side
> > > > > to
> > > > > >> > >> > >> periodically
> > > > > >> > >> > >> > > > print
> > > > > >> > >> > >> > > > > the (client_id, byte-rate-throttle-time,
> > > > > >> > >> > >> > io-thread-unit-throttle-time)
> > > > > >> > >> > >> > > so
> > > > > >> > >> > >> > > > > that the kafka administrator can figure out those
> users
> > > that
> > > > > >> have
> > > > > >> > >> > reached
> > > > > >> > >> > >> > their
> > > > > >> > >> > >> > > > > limit and act accordingly?
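> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > For example, a rough sketch of such a periodic log (slf4j-style
> > > > > >> > >> > >> > > > > logger; all names hypothetical):
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > >   ScheduledExecutorService exec =
> > > > > >> > >> > >> > > > >       Executors.newSingleThreadScheduledExecutor();
> > > > > >> > >> > >> > > > >   exec.scheduleAtFixedRate(() ->
> > > > > >> > >> > >> > > > >       throttleTimes.forEach((clientId, t) ->
> > > > > >> > >> > >> > > > >           log.info("clientId={} byteRateMs={} ioThreadMs={}",
> > > > > >> > >> > >> > > > >                    clientId, t.byteRateMs, t.ioThreadMs)),
> > > > > >> > >> > >> > > > >       1, 1, TimeUnit.MINUTES);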
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > Thanks,
> > > > > >> > >> > >> > > > > Dong
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang
> Wang <
> > > > > >> > >> > >> wangguoz@gmail.com>
> > > > > >> > >> > >> > > > wrote:
> > > > > >> > >> > >> > > > >
> > > > > >> > >> > >> > > > > > Made a pass over the doc, overall LGTM
> except
> > a
> > > > > minor
> > > > > >> > >> comment
> > > > > >> > >> > on
> > > > > >> > >> > >> > the
> > > > > >> > >> > >> > > > > > throttling implementation:
> > > > > >> > >> > >> > > > > >
> > > > > >> > >> > >> > > > > > Stated as "Request processing time
> throttling
> > > will
> > > > > be
> > > > > >> > >> applied
> > > > > >> > >> > on
> > > > > >> > >> > >> > top
> > > > > >> > >> > >> > > if
> > > > > >> > >> > >> > > > > > necessary." I thought that it meant the
> > request
> > > > > >> > processing
> > > > > >> > >> > time
> > > > > >> > >> > >> > > > > throttling
> > > > > >> > >> > >> > > > > > is applied first, but on reading further I
> found
> > > it
> > > > > >> > actually
> > > > > >> > >> > >> meant to
> > > > > >> > >> > >> > > > apply
> > > > > >> > >> > >> > > > > > produce / fetch byte rate throttling first.
> > > > > >> > >> > >> > > > > >
> > > > > >> > >> > >> > > > > > Also the last sentence "The remaining delay
> if
> > > any
> > > > > is
> > > > > >> > >> applied
> > > > > >> > >> > to
> > > > > >> > >> > >> > the
> > > > > >> > >> > >> > > > > > response." is a bit confusing to me. Maybe
> > > > rewording
> > > > > >> it a
> > > > > >> > >> bit?
> > > > > >> > >> > >> > > > > >
> > > > > >> > >> > >> > > > > >
> > > > > >> > >> > >> > > > > > Guozhang
> > > > > >> > >> > >> > > > > >
> > > > > >> > >> > >> > > > > >
> > > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <
> > > > > >> > jun@confluent.io
> > > > > >> > >> >
> > > > > >> > >> > >> wrote:
> > > > > >> > >> > >> > > > > >
> > > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > > >> > >> > >> > > > > > >
> > > > > >> > >> > >> > > > > > > Thanks for the updated KIP. The latest
> > > proposal
> > > > > >> looks
> > > > > >> > >> good
> > > > > >> > >> > to
> > > > > >> > >> > >> me.
> > > > > >> > >> > >> > > > > > >
> > > > > >> > >> > >> > > > > > > Jun
> > > > > >> > >> > >> > > > > > >
> > > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini
> > > Sivaram
> > > > <
> > > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > > >> > >> > >> > > > > > >
> > > > > >> > >> > >> > > > > > > wrote:
> > > > > >> > >> > >> > > > > > >
> > > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > Thank you for the feedback.
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > 1. I have updated the KIP to use
> absolute
> > > > units
> > > > > >> > >> instead of
> > > > > >> > >> > >> > > > > percentage.
> > > > > >> > >> > >> > > > > > > The
> > > > > >> > >> > >> > > > > > > > property is called *io_thread_units* to
> > > align
> > > > > with
> > > > > >> > the
> > > > > >> > >> > >> thread
> > > > > >> > >> > >> > > count
> > > > > >> > >> > >> > > > > > > > property *num.io.threads*. When we
> > implement
> > > > > >> network
> > > > > >> > >> > thread
> > > > > >> > >> > >> > > > > utilization
> > > > > >> > >> > >> > > > > > > > quotas, we can add another property
> > > > > >> > >> > *network_thread_units.*
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > 2. ControlledShutdown is already listed
> > > under
> > > > > the
> > > > > >> > >> exempt
> > > > > >> > >> > >> > > requests.
> > > > > >> > >> > >> > > > > Jun,
> > > > > >> > >> > >> > > > > > > did
> > > > > >> > >> > >> > > > > > > > you mean a different request that needs
> to
> > > be
> > > > > >> added?
> > > > > >> > >> The
> > > > > >> > >> > >> four
> > > > > >> > >> > >> > > > > requests
> > > > > >> > >> > >> > > > > > > > currently exempt in the KIP are
> > StopReplica,
> > > > > >> > >> > >> > ControlledShutdown,
> > > > > >> > >> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These
> are
> > > > > >> controlled
> > > > > >> > >> > using
> > > > > >> > >> > >> > > > > > ClusterAction
> > > > > >> > >> > >> > > > > > > > ACL, so it is easy to exclude and only
> > > > throttle
> > > > > if
> > > > > >> > >> > >> > unauthorized.
> > > > > >> > >> > >> > > I
> > > > > >> > >> > >> > > > > > wasn't
> > > > > >> > >> > >> > > > > > > > sure if there are other requests used
> only
> > > for
> > > > > >> > >> > inter-broker
> > > > > >> > >> > >> > that
> > > > > >> > >> > >> > > > > needed
> > > > > >> > >> > >> > > > > > > to
> > > > > >> > >> > >> > > > > > > > be excluded.
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > 3. I was thinking the smallest change
> > would
> > > be
> > > > > to
> > > > > >> > >> replace
> > > > > >> > >> > >> all
> > > > > >> > >> > >> > > > > > references
> > > > > >> > >> > >> > > > > > > to
> > > > > >> > >> > >> > > > > > > > *requestChannel.sendResponse()* with a
> > > local
> > > > > >> method
> > > > > >> > >> > >> > > > > > > > *sendResponseMaybeThrottle()* that does
> > the
> > > > > >> > throttling
> > > > > >> > >> if
> > > > > >> > >> > >> any
> > > > > >> > >> > >> > > plus
> > > > > >> > >> > >> > > > > send
> > > > > >> > >> > >> > > > > > > > response. If we throttle first in
> > > > > >> > *KafkaApis.handle()*,
> > > > > >> > >> > the
> > > > > >> > >> > >> > time
> > > > > >> > >> > >> > > > > spent
> > > > > >> > >> > >> > > > > > > > within the method handling the request
> > will
> > > > not
> > > > > be
> > > > > >> > >> > recorded
> > > > > >> > >> > >> or
> > > > > >> > >> > >> > > used
> > > > > >> > >> > >> > > > > in
> > > > > >> > >> > >> > > > > > > > throttling. We can look into this again
> > when
> > > > the
> > > > > >> PR
> > > > > >> > is
> > > > > >> > >> > ready
> > > > > >> > >> > >> > for
> > > > > >> > >> > >> > > > > > review.
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > Regards,
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > Rajini
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger
> > > Hoover
> > > > <
> > > > > >> > >> > >> > > > > roger.hoover@gmail.com>
> > > > > >> > >> > >> > > > > > > > wrote:
> > > > > >> > >> > >> > > > > > > >
> > > > > >> > >> > >> > > > > > > > > Great to see this KIP and the
> excellent
> > > > > >> discussion.
> > > > > >> > >> > >> > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > To me, Jun's suggestion makes sense.
> If
> > > my
> > > > > >> > >> application
> > > > > >> > >> > is
> > > > > >> > >> > >> > > > > allocated
> > > > > >> > >> > >> > > > > > 1
> > > > > >> > >> > >> > > > > > > > > request handler unit, then it's as if
> I
> > > > have a
> > > > > >> > Kafka
> > > > > >> > >> > >> broker
> > > > > >> > >> > >> > > with
> > > > > >> > >> > >> > > > a
> > > > > >> > >> > >> > > > > > > single
> > > > > >> > >> > >> > > > > > > > > request handler thread dedicated to
> me.
> > > > > That's
> > > > > >> the
> > > > > >> > >> > most I
> > > > > >> > >> > >> > can
> > > > > >> > >> > >> > > > use,
> > > > > >> > >> > >> > > > > > at
> > > > > >> > >> > >> > > > > > > > > least.  That allocation doesn't change
> > > even
> > > > if
> > > > > >> an
> > > > > >> > >> admin
> > > > > >> > >> > >> later
> > > > > >> > >> > >> > > > > > increases
> > > > > >> > >> > >> > > > > > > > the
> > > > > >> > >> > >> > > > > > > > > size of the request thread pool on the
> > > > broker.
> > > > > >> > It's
> > > > > >> > >> > >> similar
> > > > > >> > >> > >> > to
> > > > > >> > >> > >> > > > the
> > > > > >> > >> > >> > > > > > CPU
> > > > > >> > >> > >> > > > > > > > > abstraction that VMs and containers
> get
> > > from
> > > > > >> > >> hypervisors
> > > > > >> > >> > >> or
> > > > > >> > >> > >> > OS
> > > > > >> > >> > >> > > > > > > > schedulers.
> > > > > >> > >> > >> > > > > > > > > While different client access patterns
> > can
> > > > use
> > > > > >> > wildly
> > > > > >> > >> > >> > different
> > > > > >> > >> > >> > > > > > amounts
> > > > > >> > >> > >> > > > > > > > of
> > > > > >> > >> > >> > > > > > > > > request thread resources per request,
> a
> > > > given
> > > > > >> > >> > application
> > > > > >> > >> > >> > will
> > > > > >> > >> > >> > > > > > > generally
> > > > > >> > >> > >> > > > > > > > > have a stable access pattern and can
> > > figure
> > > > > out
> > > > > >> > >> > >> empirically
> > > > > >> > >> > >> > how
> > > > > >> > >> > >> > > > > many
> > > > > >> > >> > >> > > > > > > > > "request thread units" it needs to
> meet
> > > its
> > > > > >> > >> > >> > throughput/latency
> > > > > >> > >> > >> > > > > > goals.
> > > > > >> > >> > >> > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > Cheers,
> > > > > >> > >> > >> > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > Roger
> > > > > >> > >> > >> > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun
> > Rao <
> > > > > >> > >> > >> jun@confluent.io>
> > > > > >> > >> > >> > > > wrote:
> > > > > >> > >> > >> > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > Hi, Rajini,
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > Thanks for the updated KIP. A few
> more
> > > > > >> comments.
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > 1. A concern of request_time_percent
> > is
> > > > that
> > > > > >> it's
> > > > > >> > >> not
> > > > > >> > >> > an
> > > > > >> > >> > >> > > > absolute
> > > > > >> > >> > >> > > > > > > > value.
> > > > > >> > >> > >> > > > > > > > > > Let's say you give a user a 10%
> limit.
> > > If
> > > > > the
> > > > > >> > admin
> > > > > >> > >> > >> doubles
> > > > > >> > >> > >> > > the
> > > > > >> > >> > >> > > > > > > number
> > > > > >> > >> > >> > > > > > > > of
> > > > > >> > >> > >> > > > > > > > > > request handler threads, that user
> now
> > > > > >> actually
> > > > > >> > has
> > > > > >> > >> > >> twice
> > > > > >> > >> > >> > the
> > > > > >> > >> > >> > > > > > > absolute
> > > > > >> > >> > >> > > > > > > > > > capacity. This may confuse people a
> > bit.
> > > > So,
> > > > > >> > >> perhaps
> > > > > >> > >> > >> > setting
> > > > > >> > >> > >> > > > the
> > > > > >> > >> > >> > > > > > > quota
> > > > > >> > >> > >> > > > > > > > > > based on an absolute request thread
> > unit
> > > > is
> > > > > >> > better.
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also
> > an
> > > > > >> > >> inter-broker
> > > > > >> > >> > >> > request
> > > > > >> > >> > >> > > > and
> > > > > >> > >> > >> > > > > > > needs
> > > > > >> > >> > >> > > > > > > > to
> > > > > >> > >> > >> > > > > > > > > > be excluded from throttling.
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > 3. Implementation wise, I am
> wondering
> > > if
> > > > > it's
> > > > > >> > >> simpler
> > > > > >> > >> > >> to
> > > > > >> > >> > >> > > apply
> > > > > >> > >> > >> > > > > the
> > > > > >> > >> > >> > > > > > > > > request
> > > > > >> > >> > >> > > > > > > > > > time throttling first in
> > > > KafkaApis.handle().
> > > > > >> > >> > Otherwise,
> > > > > >> > >> > >> we
> > > > > >> > >> > >> > > will
> > > > > >> > >> > >> > > > > > need
> > > > > >> > >> > >> > > > > > > to
> > > > > >> > >> > >> > > > > > > > > add
> > > > > >> > >> > >> > > > > > > > > > the throttling logic in each type of
> > > > > request.
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > Thanks,
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > Jun
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM,
> > Rajini
> > > > > >> Sivaram <
> > > > > >> > >> > >> > > > > > > > rajinisivaram@gmail.com
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > wrote:
> > > > > >> > >> > >> > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > Jun,
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > Thank you for the review.
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > I have reverted to the original
> KIP
> > > that
> > > > > >> > >> throttles
> > > > > >> > >> > >> based
> > > > > >> > >> > >> > on
> > > > > >> > >> > >> > > > > > request
> > > > > >> > >> > >> > > > > > > > > > handler
> > > > > >> > >> > >> > > > > > > > > > > utilization. At the moment, it
> uses
> > > > > >> percentage,
> > > > > >> > >> but
> > > > > >> > >> > I
> > > > > >> > >> > >> am
> > > > > >> > >> > >> > > > happy
> > > > > >> > >> > >> > > > > to
> > > > > >> > >> > >> > > > > > > > > change
> > > > > >> > >> > >> > > > > > > > > > to
> > > > > >> > >> > >> > > > > > > > > > > a fraction (out of 1 instead of
> 100)
> > > if
> > > > > >> > >> required. I
> > > > > >> > >> > >> have
> > > > > >> > >> > >> > > > added
> > > > > >> > >> > >> > > > > > the
> > > > > >> > >> > >> > > > > > > > > > examples
> > > > > >> > >> > >> > > > > > > > > > > from this discussion to the KIP.
> > Also
> > > > > added
> > > > > >> a
> > > > > >> > >> > "Future
> > > > > >> > >> > >> > Work"
> > > > > >> > >> > >> > > > > > section
> > > > > >> > >> > >> > > > > > > > to
> > > > > >> > >> > >> > > > > > > > > > > address network thread
> utilization.
> > > The
> > > > > >> > >> > configuration
> > > > > >> > >> > >> is
> > > > > >> > >> > >> > > > named
> > > > > >> > >> > >> > > > > > > > > > > "request_time_percent" with the
> > > > > expectation
> > > > > >> > that
> > > > > >> > >> it
> > > > > >> > >> > >> can
> > > > > >> > >> > >> > > also
> > > > > >> > >> > >> > > > be
> > > > > >> > >> > >> > > > > > > used
> > > > > >> > >> > >> > > > > > > > as
> > > > > >> > >> > >> > > > > > > > > > the
> > > > > >> > >> > >> > > > > > > > > > > limit for network thread
> utilization
> > > > when
> > > > > >> that
> > > > > >> > is
> > > > > >> > >> > >> > > > implemented,
> > > > > >> > >> > >> > > > > so
> > > > > >> > >> > >> > > > > > > > that
> > > > > >> > >> > >> > > > > > > > > > > users have to set only one config
> > for
> > > > the
> > > > > >> two
> > > > > >> > and
> > > > > >> > >> > not
> > > > > >> > >> > >> > have
> > > > > >> > >> > >> > > to
> > > > > >> > >> > >> > > > > > worry
> > > > > >> > >> > >> > > > > > > > > about
> > > > > >> > >> > >> > > > > > > > > > > the internal distribution of the
> > work
> > > > > >> between
> > > > > >> > the
> > > > > >> > >> > two
> > > > > >> > >> > >> > > thread
> > > > > >> > >> > >> > > > > > pools
> > > > > >> > >> > >> > > > > > > in
> > > > > >> > >> > >> > > > > > > > > > > Kafka.
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > Regards,
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > Rajini
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM,
> > Jun
> > > > Rao
> > > > > <
> > > > > >> > >> > >> > > jun@confluent.io>
> > > > > >> > >> > >> > > > > > > wrote:
> > > > > >> > >> > >> > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > Hi, Rajini,
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > Thanks for the proposal.
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > The benefit of using the request
> > > > > >> processing
> > > > > >> > >> time
> > > > > >> > >> > >> over
> > > > > >> > >> > >> > the
> > > > > >> > >> > >> > > > > > request
> > > > > >> > >> > >> > > > > > > > > rate
> > > > > >> > >> > >> > > > > > > > > > is
> > > > > >> > >> > >> > > > > > > > > > > > exactly what people have said. I
> > > will
> > > > > just
> > > > > >> > >> expand
> > > > > >> > >> > >> that
> > > > > >> > >> > >> > a
> > > > > >> > >> > >> > > > bit.
> > > > > >> > >> > >> > > > > > > > > Consider
> > > > > >> > >> > >> > > > > > > > > > > the
> > > > > >> > >> > >> > > > > > > > > > > > following case. The producer
> > sends a
> > > > > >> produce
> > > > > >> > >> > request
> > > > > >> > >> > >> > > with a
> > > > > >> > >> > >> > > > > > 10MB
> > > > > >> > >> > >> > > > > > > > > > message
> > > > > >> > >> > >> > > > > > > > > > > > but compressed to 100KB with
> gzip.
> > > The
> > > > > >> > >> > >> decompression of
> > > > > >> > >> > >> > > the
> > > > > >> > >> > >> > > > > > > message
> > > > > >> > >> > >> > > > > > > > > on
> > > > > >> > >> > >> > > > > > > > > > > the
> > > > > >> > >> > >> > > > > > > > > > > > broker could take 10-15 seconds,
> > > > during
> > > > > >> which
> > > > > >> > >> > time,
> > > > > >> > >> > >> a
> > > > > >> > >> > >> > > > request
> > > > > >> > >> > >> > > > > > > > handler
> > > > > >> > >> > >> > > > > > > > > > > > thread is completely blocked. In
> > > this
> > > > > >> case,
> > > > > >> > >> > neither
> > > > > >> > >> > >> the
> > > > > >> > >> > >> > > > > byte-in
> > > > > >> > >> > >> > > > > > > > quota
> > > > > >> > >> > >> > > > > > > > > > nor
> > > > > >> > >> > >> > > > > > > > > > > > the request rate quota may be
> > > > effective
> > > > > in
> > > > > >> > >> > >> protecting
> > > > > >> > >> > >> > the
> > > > > >> > >> > >> > > > > > broker.
> > > > > >> > >> > >> > > > > > > > > > > Consider
> > > > > >> > >> > >> > > > > > > > > > > > another case. A consumer group
> > > starts
> > > > > >> with 10
> > > > > >> > >> > >> instances
> > > > > >> > >> > >> > > and
> > > > > >> > >> > >> > > > > > later
> > > > > >> > >> > >> > > > > > > > on
> > > > > >> > >> > >> > > > > > > > > > > > switches to 20 instances. The
> > > request
> > > > > rate
> > > > > >> > will
> > > > > >> > >> > >> likely
> > > > > >> > >> > >> > > > > double,
> > > > > >> > >> > >> > > > > > > but
> > > > > >> > >> > >> > > > > > > > > the
> > > > > >> > >> > >> > > > > > > > > > > > actual load on the broker may
> > not
> > > > > double
> > > > > >> > >> since
> > > > > >> > >> > >> each
> > > > > >> > >> > >> > > fetch
> > > > > >> > >> > >> > > > > > > request
> > > > > >> > >> > >> > > > > > > > > > only
> > > > > >> > >> > >> > > > > > > > > > > > contains half of the partitions.
> > > > Request
> > > > > >> rate
> > > > > >> > >> > quota
> > > > > >> > >> > >> may
> > > > > >> > >> > >> > > not
> > > > > >> > >> > >> > > > > be
> > > > > >> > >> > >> > > > > > > easy
> > > > > >> > >> > >> > > > > > > > > to
> > > > > >> > >> > >> > > > > > > > > > > > configure in this case.
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > What we really want is to be
> able
> > to
> > > > > >> prevent
> > > > > >> > a
> > > > > >> > >> > >> client
> > > > > >> > >> > >> > > from
> > > > > >> > >> > >> > > > > > using
> > > > > >> > >> > >> > > > > > > > too
> > > > > >> > >> > >> > > > > > > > > > much
> > > > > >> > >> > >> > > > > > > > > > > > of the server side resources. In
> > > this
> > > > > >> > >> particular
> > > > > >> > >> > >> KIP,
> > > > > >> > >> > >> > > this
> > > > > >> > >> > >> > > > > > > resource
> > > > > >> > >> > >> > > > > > > > > is
> > > > > >> > >> > >> > > > > > > > > > > the
> > > > > >> > >> > >> > > > > > > > > > > > capacity of the request handler
> > > > > threads. I
> > > > > >> > >> agree
> > > > > >> > >> > >> that
> > > > > >> > >> > >> > it
> > > > > >> > >> > >> > > > may
> > > > > >> > >> > >> > > > > > not
> > > > > >> > >> > >> > > > > > > be
> > > > > >> > >> > >> > > > > > > > > > > > intuitive for the users to
> > determine
> > > > how
> > > > > >> to
> > > > > >> > set
> > > > > >> > >> > the
> > > > > >> > >> > >> > right
> > > > > >> > >> > >> > > > > > limit.
> > > > > >> > >> > >> > > > > > > > > > However,
> > > > > >> > >> > >> > > > > > > > > > > > this is not completely new and
> has
> > > > been
> > > > > >> done
> > > > > >> > in
> > > > > >> > >> > the
> > > > > >> > >> > >> > > > container
> > > > > >> > >> > >> > > > > > > world
> > > > > >> > >> > >> > > > > > > > > > > > already. For example, Linux
> > cgroup (
> > > > > >> > >> > >> > > > > https://access.redhat.com/
> > > > > >> > >> > >> > > > > > > > > > > > documentation/en-US/Red_Hat_En
> > > > > >> > >> > >> terprise_Linux/6/html/
> > > > > >> > >> > >> > > > > > > > > > > > Resource_Management_Guide/sec-
> > > > cpu.html)
> > > > > >> has
> > > > > >> > >> the
> > > > > >> > >> > >> > concept
> > > > > >> > >> > >> > > of
> > > > > >> > >> > >> > > > > > > > > > > > cpu.cfs_quota_us,
> > > > > >> > >> > >> > > > > > > > > > > > which specifies the total amount
> > of
> > > > time
> > > > > >> in
> > > > > >> > >> > >> > microseconds
> > > > > >> > >> > >> > > > for
> > > > > >> > >> > >> > > > > > > which
> > > > > >> > >> > >> > > > > > > > > all
> > > > > >> > >> > >> > > > > > > > > > > > tasks in a cgroup can run
> during a
> > > one
> > > > > >> second
> > > > > >> > >> > >> period.
> > > > > >> > >> > >> > We
> > > > > >> > >> > >> > > > can
> > > > > >> > >> > >> > > > > > > > > > potentially
> > > > > >> > >> > >> > > > > > > > > > > > model the request handler
> threads
> > > in a
> > > > > >> > similar
> > > > > >> > >> > way.
> > > > > >> > >> > >> For
> > > > > >> > >> > >> > > > > > example,
> > > > > >> > >> > >> > > > > > > > each
> > > > > >> > >> > >> > > > > > > > > > > > request handler thread can be 1
> > > > request
> > > > > >> > handler
> > > > > >> > >> > unit
> > > > > >> > >> > >> > and
> > > > > >> > >> > >> > > > the
> > > > > >> > >> > >> > > > > > > admin
> > > > > >> > >> > >> > > > > > > > > can
> > > > > >> > >> > >> > > > > > > > > > > > configure a limit on how many
> > units
> > > > (say
> > > > > >> > 0.01)
> > > > > >> > >> a
> > > > > >> > >> > >> client
> > > > > >> > >> > >> > > can
> > > > > >> > >> > >> > > > > > have.
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > Regarding not throttling the
> > > internal
> > > > > >> broker
> > > > > >> > to
> > > > > >> > >> > >> broker
> > > > > >> > >> > >> > > > > > requests.
> > > > > >> > >> > >> > > > > > > We
> > > > > >> > >> > >> > > > > > > > > > could
> > > > > >> > >> > >> > > > > > > > > > > > do that. Alternatively, we could
> > > just
> > > > > let
> > > > > >> the
> > > > > >> > >> > admin
> > > > > >> > >> > >> > > > > configure a
> > > > > >> > >> > >> > > > > > > > high
> > > > > >> > >> > >> > > > > > > > > > > limit
> > > > > >> > >> > >> > > > > > > > > > > > for the kafka user (it may not
> be
> > > able
> > > > > to
> > > > > >> do
> > > > > >> > >> that
> > > > > >> > >> > >> > easily
> > > > > >> > >> > >> > > > > based
> > > > > >> > >> > >> > > > > > on
> > > > > >> > >> > >> > > > > > > > > > > clientId
> > > > > >> > >> > >> > > > > > > > > > > > though).
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > Ideally we want to be able to
> > > protect
> > > > > the
> > > > > >> > >> > >> utilization
> > > > > >> > >> > >> > of
> > > > > >> > >> > >> > > > the
> > > > > >> > >> > >> > > > > > > > network
> > > > > >> > >> > >> > > > > > > > > > > thread
> > > > > >> > >> > >> > > > > > > > > > > > pool too. The difficulty is
> mostly
> > > what
> > > > > >> Rajini
> > > > > >> > >> > said:
> > > > > >> > >> > >> (1)
> > > > > >> > >> > >> > > The
> > > > > >> > >> > >> > > > > > > > mechanism
> > > > > >> > >> > >> > > > > > > > > > for
> > > > > >> > >> > >> > > > > > > > > > > > throttling the requests is
> through
> > > > > >> Purgatory
> > > > > >> > >> and
> > > > > >> > >> > we
> > > > > >> > >> > >> > will
> > > > > >> > >> > >> > > > have
> > > > > >> > >> > >> > > > > > to
> > > > > >> > >> > >> > > > > > > > > think
> > > > > >> > >> > >> > > > > > > > > > > > through how to integrate that
> into
> > > the
> > > > > >> > network
> > > > > >> > >> > >> layer.
> > > > > >> > >> > >> > > (2)
> > > > > >> > >> > >> > > > In
> > > > > >> > >> > >> > > > > > the
> > > > > >> > >> > >> > > > > > > > > > network
> > > > > >> > >> > >> > > > > > > > > > > > layer, currently we know the
> user,
> > > but
> > > > > not
> > > > > >> > the
> > > > > >> > >> > >> clientId
> > > > > >> > >> > >> > > of
> > > > > >> > >> > >> > > > > the
> > > > > >> > >> > >> > > > > > > > > request.
> > > > > >> > >> > >> > > > > > > > > > > So,
> > > > > >> > >> > >> > > > > > > > > > > > it's a bit tricky to throttle
> > based
> > > on
> > > > > >> > clientId
> > > > > >> > >> > >> there.
> > > > > >> > >> > >> > > > Plus,
> > > > > >> > >> > >> > > > > > the
> > > > > >> > >> > >> > > > > > > > > > byteOut
> > > > > >> > >> > >> > > > > > > > > > > > quota can already protect the
> > > network
> > > > > >> thread
> > > > > >> > >> > >> > utilization
> > > > > >> > >> > >> > > > for
> > > > > >> > >> > >> > > > > > > fetch
> > > > > >> > >> > >> > > > > > > > > > > > requests. So, if we can't figure
> > out
> > > > > this
> > > > > >> > part
> > > > > >> > >> > right
> > > > > >> > >> > >> > now,
> > > > > >> > >> > >> > > > > just
> > > > > >> > >> > >> > > > > > > > > focusing
> > > > > >> > >> > >> > > > > > > > > > > on
> > > > > >> > >> > >> > > > > > > > > > > > the request handling threads for
> > > this
> > > > > KIP
> > > > > >> is
> > > > > >> > >> > still a
> > > > > >> > >> > >> > > useful
> > > > > >> > >> > >> > > > > > > > feature.
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > Thanks,
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > Jun
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM,
> > > > Rajini
> > > > > >> > >> Sivaram <
> > > > > >> > >> > >> > > > > > > > > > rajinisivaram@gmail.com
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > wrote:
> > > > > >> > >> > >> > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > > Thank you all for the
> feedback.
> > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > > Jay: I have removed exemption
> > for
> > > > > >> consumer
> > > > > >> > >> > >> heartbeat
> > > > > >> > >> > >> > > etc.
> > > > > >> > >> > >> > > > > > Agree
> > > > > >> > >> > >> > > > > > > > > that
> > > > > >> > >> > >> > > > > > > > > > > > > protecting the cluster is more
> > > > > important
> > > > > >> > than
> > > > > >> > >> > >> > > protecting
> > > > > >> > >> > >> > > > > > > > individual
> > > > > >> > >> > >> > > > > > > > > > > apps.
> > > > > >> > >> > >> > > > > > > > > > > > > Have retained the exemption
> for
> > > > > >> > >> > >> > > StopReplica/LeaderAndIsr
> > > > > >> > >> > >> > > > > > etc,
> > > > > >> > >> > >> > > > > > > > > these
> > > > > >> > >> > >> > > > > > > > > > > are
> > > > > >> > >> > >> > > > > > > > > > > > > throttled only if
> authorization
> > > > fails
> > > > > >> (so
> > > > > >> > >> can't
> > > > > >> > >> > be
> > > > > >> > >> > >> > used
> > > > > >> > >> > >> > > > for
> > > > > >> > >> > >> > > > > > DoS
> > > > > >> > >> > >> > > > > > > > > > attacks
> > > > > >> > >> > >> > > > > > > > > > > > in
> > > > > >> > >> > >> > > > > > > > > > > > > a secure cluster, but allows
> > > > > >> inter-broker
> > > > > >> > >> > >> requests to
> > > > > >> > >> > >> > > > > > complete
> > > > > >> > >> > >> > > > > > > > > > without
> > > > > >> > >> > >> > > > > > > > > > > > > delays).
> > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > > I will wait another day to see
> > if
> > > > > there
> > > > > >> is
> > > > > >> > >> any
> > > > > >> > >> > >> > > objection
> > > > > >> > >> > >> > > > to
> > > > > >> > >> > >> > > > > > > > quotas
> > > > > >> > >> > >> > > > > > > > > > > based
> > > > > >> > >> > >> > > > > > > > > > > > on
> > > > > >> > >> > >> > > > > > > > > > > > > request processing time (as
> > > opposed
> > > > to
> > > > > >> > >> request
> > > > > >> > >> > >> rate)
> > > > > >> > >> > >> > > and
> > > > > >> > >> > >> > > > if
> > > > > >> > >> > >> > > > > > > there
> > > > > >> > >> > >> > > > > > > > > are
> > > > > >> > >> > >> > > > > > > > > > > no
> > > > > >> > >> > >> > > > > > > > > > > > > objections, I will revert to
> the
> > > > > >> original
> > > > > >> > >> > proposal
> > > > > >> > >> > >> > with
> > > > > >> > >> > >> > > > > some
> > > > > >> > >> > >> > > > > > > > > changes.
> > > > > >> > >> > >> > > > > > > > > > > > >
> > > > > >> > >> > >> > > > > > > > > > > > > The original proposal was only
> > > > > including
> > > > > >> > the
> > > > > >> > >> > time
> > > > > >> > >> > >> > used
> > > > > >> > >> > >> > > by
> > > > > >> > >> > >> > > > > the
> > > > > >> > >> > >> > > > > > > > > request
> > > > > >> > >> > >> > > > > > > > > > > > > handler threads (that made
> > > > calculation
> > > > > >> > >> easy). I
> > > > > >> > >> > >> think
> > > > > >> > >> > >> > > the
> > > > > >> > >> > >> > > > > > > > > suggestion
> > > > > >> > >> > >> > > > > > > > > > is
> > > > > >> > >> > >> > > > > > > > > > > > to
> > > > > >> > >> > >> > > > > > > > > > > > > include the time spent in the
> > > > network
> > > > > >> > >> threads as
> > > > > >> > >> > >> well
> > > > > >> > >> > >> > > > since
> > > > > >> > >> > >> > > > > > > that
> > > > > >> > >> > >> > > > > > > > > may
> > > > > >> > >> > >> > > > > > > > > > be
> > > > > >> > >> > >> > > > > > > > > > > > > significant. As Jay pointed
> out,
> > > it
> > > > is
> > > > > >> more
> > > > > >> > >> > >> > complicated
> > > > > >> > >> > >> > > > to
> > > > > >> > >> > >> > > > > > > > > calculate
> > > > > >> > >> > >> > > > > > > > > > > the
> > > > > >> > >> > >> > > > > > > > > > > > > total available CPU time and
> > > convert
> > > > > to
> > > > > >> a
> > > > > >> > >> ratio
> > > > > >> > >> > >> when
> > > > > >> > >> > >> > > > there
> > > > > >> > >> > >> > > > > > *m*
> > > > > >> > >> > >> > > > > > > > I/O
> > > > > >> > >> > >> > > > > > > > > > > > threads
> > > > > >> > >> > >> > > > > > > > > > > > > and *n* network threads.
> > > > > >> > >> > >> > ThreadMXBean#getThreadCPUTime(
> > > > > >> > >> > >> > > )
> > > > > >> > >> > >> > > > > may
> > > > > >> > >> > >> > > > > > > > give
> > > > > >> > >> > >> > > > > > > > > us
> > > > > >> > >> > >> > > > > > > > > > > > what
> > > > > >> > >> > >> > > > > > > > > > > > > we want, but it can be very
> > > > expensive
> > > > > on
> > > > > >> > some
> > > > > >> > >> > >> > > platforms.
> > > > > >> > >> > >> > > > As
> > > > > >> > >> > >> > > > > > > > Becket
> > > > > >> > >> > >> > > > > > > > > > and
> > > > > >> > >> > >> > > > > > > > > > > > > Guozhang have pointed out, we
> do
> > > > have
> > > > > >> > several
> > > > > >> > >> > time
> > > > > >> > >> > >> > > > > > measurements
> > > > > >> > >> > >> > > > > > > > > > already
> > > > > >> > >> > >> > > > > > > > > > > > for
> > > > > >> > >> > >> > > > > > > > > > > > > generating metrics that we
> could
> > > > use,
> > > > > >> > though
> > > > > >> > >> we
> > > > > >> > >> > >> might
> > > > > >> > >> > >> > > > want
> > > > > >> > >> > >> > > > > to
> > > > > >> > >> > >> > > > > > > > > switch
> > > > > >> > >> > >> > > > > > > > > > to
> > > > > >> > >> > >> > > > > > > > > > > > > nanoTime() instead of
> > > > > >> currentTimeMillis()
> > > > > >> > >> since
> > > > > >> > >> > >> some
> > > > > >> > >> > >> > of
> > > > > >> > >> > >> > > > the
> > > > > >> > >> > >> > > > > > > > values
> > > > > >> > >> > >> > > > > > > > > > for
> > > > > >> > >> > >> > > > > > > > > > > > > small requests may be < 1ms.
> But
> > > > > rather
> > > > > >> > than
> > > > > >> > >> add
> > > > > >> > >> > >> up
> > > > > >> > >> > >> > the
> > > > > >> > >> > >> > > > > time
> > > > > >> > >> > >> > > > > > > > spent
> > > > > >> > >> > >> > > > > > > > > in
> > > > > >> > >> > >> > > > > > > > > > > I/O
> > > > > >> > >> > >> > > > > > > > > > > > > thread and network thread,
> > > wouldn't
> > > > it
> > > > > >> be
> > > > > >> > >> better
> > > > > >> > >> > >> to
> > > > > >> > >> > >> > > > convert
> > > > > >> > >> > >> > > > > > the
> > > > > >> > >> > >> > > > > > > > > time
> > > > > >> > >> > >> > > > > > > > > > > > spent
> > > > > >> > >> > >> > > > > > > > > > > > > on each thread into a separate
> > > > ratio?
> > > > > >> UserA
> > > > > >> > >> has
> > > > > >> > >> > a
> > > > > >> > >> > >> > > request
> > > > > >> > >> > >> > > > > > quota
> > > > > >> > >> > >> > > > > > > > of
> > > > > >> > >> > >> > > > > > > > > > 5%.
> > > > > >> > >> > >> > > > > > > > > > > > Can
> > > > > >> > >> > >> > > > > > > > > > > > > we take that to mean that
> UserA
> > > can
> > > > > use
> > > > > >> 5%
> > > > > >> > of
> > > > > >> > >> > the
> > > > > >> > >> > >> > time
> > > > > >> > >> > >> > > on
> > > > > >> > >> > >> > > > > > > network
> > > > > >> > >> > >> > > > > > > > > > > threads
> > > > > >> > >> > >> > > > > > > > > > > > > and 5% of the time on I/O
> > threads?
> > > > If
> > > > > >> > either
> > > > > >> > >> is
> > > > > >> > >> > >> > > exceeded,
> > > > > >> > >> > >> > > > > the
> > > > > >> > >> > >> > > > > > > > > > response
> > > > > >> > >> > >> > > > > > > > > > > is
> > > > > >> > >> > >> > > > > > > > > > > > > throttled - it would mean
> > > > maintaining
> > > > > >> two
> > > > > >> > >> sets
> > > > > >> > >> > of
> > > > > >> > >> > >> > > metrics
> > > > > >> > >> > >> > > > > for
> > > > > >> > >> > >> > > > > > > the
> > > > > >> > >> > >> > > > > > > > > two
> > > > > >> > >> > >> > > > > > > > > > > > > durations, but would result in
> > > more
> > > > > >> > >> meaningful
> > > > > >> > >> > >> > ratios.
> > > > > >> > >> > >> > > We
> > > > > >> > >> > >> > > > > > could
> > > > > >> > >> > >> > > > > > > > > > define
> > > > > >> > >> > >> > > > > > > > > > > > two
> > > > > >> > >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of
> > > > request
> > > > > >> > threads
> > > > > >> > >> > and
> > > > > >> > >> > >> 10%
> > > > > >> > >> > >> > > of
> > > > > >> > >> > >> > > > > > > network
> > > > > >> > >> > >> > > > > > > > > > > > threads),
> > > > > >> > >> > >> > > > > > > > > > > > > but that seems unnecessary and
> > > > harder
> > > > > to
> > > > > >> > >> explain
> > > > > >> > >> > >> to
> > > > > >> > >> > >> > > > users.
> > > > > >> > >> > >> > > > > > > > > > > > >
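For concreteness, a minimal Java sketch of the two-ratio idea above; every name here is hypothetical and not part of the KIP:

    // One percentage quota applied independently to the network and I/O
    // thread pools; exceeding either pool's ratio throttles the response.
    public class TwoRatioQuota {
        private final double quotaPercent;  // e.g. 5.0 means 5% of each pool
        private final long windowStartNanos = System.nanoTime();
        private long networkTimeNanos;      // user's time on network threads
        private long ioTimeNanos;           // user's time on I/O threads

        public TwoRatioQuota(double quotaPercent) { this.quotaPercent = quotaPercent; }

        public synchronized void record(long networkNanos, long ioNanos) {
            networkTimeNanos += networkNanos;
            ioTimeNanos += ioNanos;
        }

        // True if either per-pool ratio exceeds the quota in this window.
        public synchronized boolean violated(int networkThreads, int ioThreads) {
            long elapsed = Math.max(1, System.nanoTime() - windowStartNanos);
            double networkPct = 100.0 * networkTimeNanos / (elapsed * (double) networkThreads);
            double ioPct = 100.0 * ioTimeNanos / (elapsed * (double) ioThreads);
            return networkPct > quotaPercent || ioPct > quotaPercent;
        }
    }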
> Back to why and how quotas are applied to network thread utilization:
>
> a) In the case of fetch, the time spent in the network thread may be
> significant and I can see the need to include this. Are there other
> requests where the network thread utilization is significant? In the
> case of fetch, request handler thread utilization would throttle clients
> with a high request rate and low data volume, and the fetch byte rate
> quota will throttle clients with high data volume. Network thread
> utilization is perhaps proportional to the data volume. I am wondering
> if we even need to throttle based on network thread utilization or
> whether the data volume quota covers this case.
>
> b) At the moment, we record and check for quota violation at the same
> time. If a quota is violated, the response is delayed. Using Jay's
> example of disk reads for fetches happening in the network thread, we
> can't record and delay a response after the disk reads. We could record
> the time spent on the network thread when the response is complete and
> introduce a delay for handling a subsequent request (separating out
> recording and quota violation handling in the case of network thread
> overload). Does that make sense?
>
> Regards,
>
> Rajini
>
> On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:
>
> > Hey Jay,
> >
> > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > thinking that maybe we can use the existing request statistics. They
> > are already very detailed, so we can probably see the approximate CPU
> > time from them, e.g. something like (total_time -
> > request/response_queue_time - remote_time).
> >
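As a rough illustration of Becket's estimate above (field names here are hypothetical, loosely modelled on the request statistics he mentions):

    // Approximate CPU cost of a request: total time minus time spent
    // waiting in queues and waiting on remote/purgatory work.
    public class RequestStats {
        long totalTimeMs;
        long requestQueueTimeMs;
        long responseQueueTimeMs;
        long remoteTimeMs;

        long approximateCpuTimeMs() {
            return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs - remoteTimeMs;
        }
    }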
> > I agree with Guozhang that when a user is throttled, it is likely that
> > we need to see if anything has gone wrong first, and if the users are
> > well behaved and just need more resources, we will have to bump up the
> > quota for them. It is true that pre-allocating CPU time quota precisely
> > for the users is difficult. So in practice it would probably be more
> > like first setting a relatively high protective CPU time quota for
> > everyone and increasing it for some individual clients on demand.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com>
> > wrote:
> >
> > > This is a great proposal, glad to see it happening.
> > >
> > > I am inclined to the CPU throttling, or more specifically the
> > > processing time ratio, instead of the request rate throttling as
> > > well. Becket has summed up my rationale very well above, and one
> > > thing to add here is that the former has good support both for
> > > "protecting against rogue clients" and for "utilizing a cluster for
> > > multi-tenancy usage": when thinking about how to explain this to the
> > > end users, I find it actually more natural than the request rate
> > > since, as mentioned above, different requests will have quite
> > > different "cost", and Kafka today already has various request types
> > > (produce, fetch, admin, metadata, etc). Because of that, request rate
> > > throttling may not be as effective unless it is set very
> > > conservatively.
> > >
> > > Regarding user reactions when they are throttled, I think it may
> > > differ case-by-case, and needs to be discovered / guided by looking
> > > at relevant metrics. So in other words, users would not expect to get
> > > additional information by simply being told "hey, you are throttled",
> > > which is all that throttling does; they need to take a follow-up step
> > > and see "hmm, I'm throttled probably because of ..", which is done by
> > > looking at other metric values: e.g. whether I'm bombarding the
> > > brokers with ...
>
> [Message clipped]

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Let's take your example. Let's say a user sets the limit to 50%. I am not
sure it's better to apply the same percentage separately to the network and
io thread pools. For example, for produce requests, most of the time will be
spent in the io threads whereas for fetch requests, most of the time will
be in the network threads. So, using the same percentage in both thread
pools means one pool's resources will be over-allocated.

An alternative is to simply model the network and io thread pools together.
If you have 10 io threads and 5 network threads, you get 1500% of request
processing power. A 50% limit means a total of 750% of processing power. We
just add up the time a user request spent in either a network or an io
thread. If that total exceeds 750% (no matter whether more of it is spent
in network or io threads), the request will be throttled. This seems more
general and is not sensitive to the current implementation detail of having
separate network and io thread pools. In the future, if the threading
model changes, the same concept of quota can still be applied. For now,
since it's a bit tricky to add the delay logic in the network thread pool,
we could probably just do the delaying only in the io threads as you
suggested earlier.

There is still the orthogonal question of whether a quota of 50% is out of
100% or 100% * #total processing threads. My feeling is that the latter is
slightly better based on my explanation earlier. The way to describe this
quota to the users can be "share of elapsed request processing time on a
single CPU" (similar to top).
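For concreteness, a minimal Java sketch of that combined accounting (all names hypothetical, not the KIP's API):

    // Capacity is (networkThreads + ioThreads) * 100%; a user's usage is
    // the sum of time spent on either pool, expressed as a percentage of
    // one thread ("top" style). A quota of 750% is half of 10 io + 5
    // network threads.
    public class CombinedThreadQuota {
        private final double quotaPercent;
        private final long windowStartNanos = System.nanoTime();
        private long usedNanos;  // network + io thread time for this user

        public CombinedThreadQuota(double quotaPercent) { this.quotaPercent = quotaPercent; }

        public synchronized void record(long threadTimeNanos) { usedNanos += threadTimeNanos; }

        public synchronized boolean violated() {
            long elapsed = Math.max(1, System.nanoTime() - windowStartNanos);
            double usedPercent = 100.0 * usedNanos / elapsed;
            return usedPercent > quotaPercent;
        }
    }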

Thanks,

Jun


On Fri, Mar 3, 2017 at 4:22 AM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Jun,
>
> Agree about the two scenarios.
>
> But still not sure about a single quota covering both network threads and
> I/O threads with per-thread quota. If there are 10 I/O threads and 5
> network threads and I want to assign half the quota to userA, the quota
> would be 750%. I imagine, internally, we would convert this to 500% for I/O
> and 250% for network threads to allocate 50% of each pool.
>
> A couple of scenarios:
>
> 1. Admin adds 1 extra network thread. To retain 50%, admin needs to now
> allocate 800% for each user. Or increase the quota for a few users. To me,
> it feels like admin needs to convert 50% to 800% and Kafka internally needs
> to convert 800% to (500%, 300%). Everyone using just 50% feels a lot
> simpler.
>
> 2. We decide to add some other thread to this list. Admin needs to know
> exactly how many threads form the maximum quota. And we can be changing
> this between broker versions as we add more to the list. Again a single
> overall percent would be a lot simpler.
>
> There were others who were unconvinced by a single percent from the initial
> proposal and were happier with thread units similar to CPU units, so I am
> ok with going with per-thread quotas (as units or percent). Just not sure
> it makes it easier for admin in all cases.
>
> Regards,
>
> Rajini
>
>
> On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Rajini,
> >
> > Consider modeling the capacity as n * 100% units. For 2), the question is
> > what's causing the I/O threads to be saturated. It's unlikely that all
> > users' utilization has increased at the same time. A more likely case is
> > that a few isolated users' utilization has increased. If so, after
> > increasing the number of threads, the admin just needs to adjust the
> > quotas for a few isolated users, which is expected and is less work.
> >
> > Consider modeling the capacity as 1 * 100% unit. For 1), all users' quotas
> > need to be adjusted, which is unexpected and is more work.
> >
> > So, to me, the n * 100% model seems more convenient.
> >
> > As for future extension to cover network thread utilization, I was
> > thinking that one way is to simply model the capacity as (n + m) * 100%
> > units, where n and m are the number of network and i/o threads,
> > respectively. Then, for each user, we can just add up the utilization in
> > the network and the i/o threads. If we do this, we don't need a new type
> > of quota.
> >
> > Thanks,
> >
> > Jun
> >
> >
> > On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > wrote:
> >
> > > Jun,
> > >
> > > If we use request.percentage as the percentage used in a single I/O
> > > thread, the total percentage being allocated will be num.io.threads *
> > > 100 for I/O threads and num.network.threads * 100 for network threads.
> > > A single quota covering the two as a percentage wouldn't quite work if
> > > you want to allocate the same proportion in both cases. If we want to
> > > treat threads as separate units, won't we need two quota configurations
> > > regardless of whether we use units or percentage? Perhaps I
> > > misunderstood your suggestion.
> > >
> > > I think there are two cases:
> > >
> > >    1. The use case that you mentioned where an admin is adding more
> > >    users and decides to add more I/O threads and expects to find free
> > >    quota to allocate for new users.
> > >    2. Admin adds more I/O threads because the I/O threads are saturated
> > >    and there are cores available to allocate, even though the number of
> > >    users/clients hasn't changed.
> > >
> > > If we treated I/O threads as a single unit of 100%, all user quotas need
> > > to be reallocated for 1). If we treated I/O threads as n units with
> > > n*100%, all user quotas need to be reallocated for 2), otherwise some of
> > > the new threads may just not be used. Either way, it should be easy to
> > > write a script to decrease/increase quotas by a multiple for all users.
> > >
> > > So it really boils down to which quota unit is most intuitive in terms
> > > of configuration. And from the discussion so far, it feels like opinion
> > > is divided on whether quotas should be carved out of an absolute 100%
> > > (or 1 unit) or be relative to the number of threads (n*100% or n units).
> > >
> > >
> > >
> > > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Another way to express an absolute limit is to use request.percentage,
> > > > but treat it as the percentage used in a single request handling
> > > > thread. For now, the request handling threads can be just the io
> > > > threads. In the future, they can cover the network threads as well.
> > > > This is similar to how top reports CPU usage and may be a bit easier
> > > > for people to understand.
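To make the top-style convention above concrete, a small hypothetical Java example (the numbers are assumed, not from the KIP):

    public class TopStyleQuotaExample {
        public static void main(String[] args) {
            double elapsedMs = 1000.0;         // measurement window
            double userThreadTimeMs = 4500.0;  // user's time across all handler threads
            double usedPercent = 100.0 * userThreadTimeMs / elapsedMs; // 450% = 4.5 threads
            System.out.println("used=" + usedPercent + "% throttled=" + (usedPercent > 300.0));
        }
    }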
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Jay,
> > > > >
> > > > > 2. Regarding request.unit vs request.percentage. I started with
> > > > > request.percentage too. The reasoning for request.unit is the
> > > > > following. Suppose that the capacity has been reached on a broker and
> > > > > the admin needs to add a new user. A simple way to increase the
> > > > > capacity is to increase the number of io threads, assuming there are
> > > > > still enough cores. If the limit is based on percentage, the
> > > > > additional capacity automatically gets distributed to existing users
> > > > > and we haven't really carved out any additional resource for the new
> > > > > user. Now, is it easy for a user to reason about 0.1 units vs 10%? My
> > > > > feeling is that both are hard and have to be configured empirically.
> > > > > Not sure if percentage is obviously easier to reason about.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io>
> wrote:
> > > > >
> > > > >> A couple of quick points:
> > > > >>
> > > > >> 1. Even though the implementation of this quota only uses io thread
> > > > >> time, I think we should call it something like "request-time". This
> > > > >> will give us flexibility to improve the implementation to cover
> > > > >> network threads in the future and will avoid exposing internal
> > > > >> details like our thread pools on the server.
> > > > >>
> > > > >> 2. Jun/Roger, I get what you are trying to fix, but the idea of
> > > > >> thread/units is super unintuitive as a user-facing knob. I had to
> > > > >> read the KIP like eight times to understand this. I'm not sure about
> > > > >> your point that increasing the number of threads is a problem with a
> > > > >> percentage-based value; it really depends on whether the user thinks
> > > > >> about the "percentage of request processing time" or "thread units".
> > > > >> If they think "I have allocated 10% of my request processing time to
> > > > >> user x", then it is a bug that increasing the thread count decreases
> > > > >> that percent as it does in the current proposal. As a practical
> > > > >> matter I think the only way to actually reason about this is as a
> > > > >> percent; I just don't believe people are going to think, "ah, 4.3
> > > > >> thread units, that is the right amount!". Instead I think they have
> > > > >> to understand this thread unit concept, figure out what they have
> > > > >> set in number of threads, compute a percent and then come up with
> > > > >> the number of thread units, and these will all be wrong if that
> > > > >> thread count changes. I also think this ties us to throttling the
> > > > >> I/O thread pool, which may not be where we want to end up.
> > > > >>
> > > > >> 3. For what it's worth, I do think having a single throttle_ms field
> > > > >> in all the responses that combines all throttling from all quotas is
> > > > >> probably the simplest. There could be a use case for having separate
> > > > >> fields for each, but I think that is actually harder to use/monitor
> > > > >> in the common case, so unless someone has a use case I think just
> > > > >> one should be fine.
> > > > >>
> > > > >> -Jay
> > > > >>
> > > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <
> > > > rajinisivaram@gmail.com>
> > > > >> wrote:
> > > > >>
> > > > >> > I have updated the KIP based on the discussions so far.
> > > > >> >
> > > > >> >
> > > > >> > Regards,
> > > > >> >
> > > > >> > Rajini
> > > > >> >
> > > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
> > > > >> rajinisivaram@gmail.com>
> > > > >> > wrote:
> > > > >> >
> > > > >> > > Thank you all for the feedback.
> > > > >> > >
> > > > >> > > Ismael #1. It makes sense not to throttle inter-broker requests
> > > > >> > > like LeaderAndIsr etc. The simplest way to ensure that clients
> > > > >> > > cannot use these requests to bypass quotas for DoS attacks is to
> > > > >> > > ensure that ACLs prevent clients from using these requests and
> > > > >> > > that unauthorized requests are included towards quotas.
> > > > >> > >
> > > > >> > > Ismael #2, Jay #1: I was thinking that these quotas can return a
> > > > >> > > separate throttle time, and all utilization based quotas could
> > > > >> > > use the same field (we won't add another one for network thread
> > > > >> > > utilization, for instance). But perhaps it makes sense to keep
> > > > >> > > byte rate quotas separate in produce/fetch responses to provide
> > > > >> > > separate metrics? Agree with Ismael that the name of the existing
> > > > >> > > field should be changed if we have two. Happy to switch to a
> > > > >> > > single combined throttle time if that is sufficient.
> > > > >> > >
> > > > >> > > Ismael #4, #5, #6: Will update the KIP. Will use a dot separated
> > > > >> > > name for the new property. Replication quotas use dot separated
> > > > >> > > names, so it will be consistent with all properties except byte
> > > > >> > > rate quotas.
> > > > >> > >
> > > > >> > > Radai: #1 Request processing time rather than request rate was
> > > > >> > > chosen because the time per request can vary significantly
> > > > >> > > between requests, as mentioned in the discussion and the KIP.
> > > > >> > > #2 Two separate quotas for heartbeats/regular requests feel like
> > > > >> > > more configuration and more metrics. Since most users would set
> > > > >> > > quotas higher than the expected usage and quotas are more of a
> > > > >> > > safety net, a single quota should work in most cases.
> > > > >> > > #3 The number of requests in purgatory is limited by the number
> > > > >> > > of active connections, since only one request per connection will
> > > > >> > > be throttled at a time.
> > > > >> > > #4 As with byte rate quotas, to use the full allocated quotas,
> > > > >> > > clients/users would need to use partitions that are distributed
> > > > >> > > across the cluster. The alternative of using cluster-wide quotas
> > > > >> > > instead of per-broker quotas would be far too complex to
> > > > >> > > implement.
> > > > >> > >
> > > > >> > > Dong: We currently have two ClientQuotaManagers for the quota
> > > > >> > > types Fetch and Produce. A new one will be added for IOThread,
> > > > >> > > which manages quotas for I/O thread utilization. This will not
> > > > >> > > update the Fetch or Produce queue-size, but will have a separate
> > > > >> > > metric for the queue-size. I wasn't planning to add any
> > > > >> > > additional metrics apart from the equivalent ones for existing
> > > > >> > > quotas as part of this KIP. The ratio of byte-rate to I/O thread
> > > > >> > > utilization could be slightly misleading since it depends on the
> > > > >> > > sequence of requests. But we can look into more metrics after the
> > > > >> > > KIP is implemented if required.
> > > > >> > >
> > > > >> > > I think we need to limit the maximum delay since all requests are
> > > > >> > > throttled. If a client has a quota of 0.001 units and a single
> > > > >> > > request used 50ms, we don't want to delay all requests from the
> > > > >> > > client by 50 seconds, throwing the client out of all its consumer
> > > > >> > > groups. The issue arises only if a user is allocated a quota that
> > > > >> > > is insufficient to process one large request. The expectation is
> > > > >> > > that the units allocated per user will be much higher than the
> > > > >> > > time taken to process one request, and the limit should seldom be
> > > > >> > > applied. Agree this needs proper documentation.
> > > > >> > >
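As a hedged illustration of the arithmetic in that last paragraph (names below are hypothetical, not from the KIP), the raw delay needed to bring a client back under quota can dwarf the window, which is why it is capped:

    // Java sketch: delay required to bring a client back under quota.
    public class ThrottleDelayExample {
        public static void main(String[] args) {
            double quotaUnits = 0.001;     // fraction of one thread's time
            double requestTimeMs = 50.0;   // time one request actually used
            double windowMs = 30_000.0;    // assumed quota window size

            double rawDelayMs = requestTimeMs / quotaUnits;   // 50,000 ms = 50 s
            double cappedMs = Math.min(rawDelayMs, windowMs); // capped to 30 s
            System.out.println("raw=" + rawDelayMs + "ms capped=" + cappedMs + "ms");
        }
    }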
> > > > >> > > Regards,
> > > > >> > >
> > > > >> > > Rajini
> > > > >> > >
> > > > >> > >
> > > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <
> > > radai.rosenblatt@gmail.com>
> > > > >> > wrote:
> > > > >> > >
> > > > >> > >> @jun: i wasn't concerned about tying up a request processing
> > > > >> > >> thread, but IIUC the code does still read the entire request
> > > > >> > >> out, which might add up to a non-negligible amount of memory.
> > > > >> > >>
> > > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <
> > lindong28@gmail.com>
> > > > >> wrote:
> > > > >> > >>
> > > > >> > >> > Hey Rajini,
> > > > >> > >> >
> > > > >> > >> > The current KIP says that the maximum delay will be reduced to
> > > > >> > >> > the window size if it is larger than the window size. I have a
> > > > >> > >> > concern with this:
> > > > >> > >> >
> > > > >> > >> > 1) This essentially means that the user is allowed to exceed
> > > > >> > >> > their quota over a long period of time. Can you provide an
> > > > >> > >> > upper bound on this deviation?
> > > > >> > >> >
> > > > >> > >> > 2) What is the motivation for capping the maximum delay by the
> > > > >> > >> > window size? I am wondering if there is a better alternative to
> > > > >> > >> > address the problem.
> > > > >> > >> >
> > > > >> > >> > 3) It means that the existing metric-related config will have a
> > > > >> > >> > more direct impact on the mechanism of this
> > > > >> > >> > io-thread-unit-based quota. This may be an important change
> > > > >> > >> > depending on the answer to 1) above. We probably need to
> > > > >> > >> > document this more explicitly.
> > > > >> > >> >
> > > > >> > >> > Dong
> > > > >> > >> >
> > > > >> > >> >
> > > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <
> > > lindong28@gmail.com>
> > > > >> > wrote:
> > > > >> > >> >
> > > > >> > >> > > Hey Jun,
> > > > >> > >> > >
> > > > >> > >> > > Yeah, you are right. I thought it wasn't because at LinkedIn
> > > > >> > >> > > it would be too much pressure on inGraph to expose those
> > > > >> > >> > > per-clientId metrics, so we ended up printing them
> > > > >> > >> > > periodically to a local log. Never mind if it is not a
> > > > >> > >> > > general problem.
> > > > >> > >> > >
> > > > >> > >> > > Hey Rajini,
> > > > >> > >> > >
> > > > >> > >> > > - I agree with Jay that we probably don't want to add a new
> > > > >> > >> > > field for every quota in ProduceResponse or FetchResponse. Is
> > > > >> > >> > > there any use-case for having separate throttle-time fields
> > > > >> > >> > > for byte-rate-quota and io-thread-unit-quota? You probably
> > > > >> > >> > > need to document this as an interface change if you plan to
> > > > >> > >> > > add a new field in any request.
> > > > >> > >> > >
> > > > >> > >> > > - I don't think IOThread belongs in quotaType. The existing
> > > > >> > >> > > quota types (i.e. Produce/Fetch/LeaderReplication/
> > > > >> > >> > > FollowerReplication) identify the type of request that is
> > > > >> > >> > > throttled, not the quota mechanism that is applied.
> > > > >> > >> > >
> > > > >> > >> > > - If a request is throttled due to this io-thread-unit-based
> > > > >> > >> > > quota, is the existing queue-size metric in
> > > > >> > >> > > ClientQuotaManager incremented?
> > > > >> > >> > >
> > > > >> > >> > > - In the interest of providing a guideline for admins to
> > > > >> > >> > > decide the io-thread-unit-based quota and for users to
> > > > >> > >> > > understand its impact on their traffic, would it be useful to
> > > > >> > >> > > have a metric that shows the overall byte-rate per
> > > > >> > >> > > io-thread-unit? Can we also show this as a per-clientId
> > > > >> > >> > > metric?
> > > > >> > >> > >
> > > > >> > >> > > Thanks,
> > > > >> > >> > > Dong
> > > > >> > >> > >
> > > > >> > >> > >
> > > > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <
> jun@confluent.io
> > >
> > > > >> wrote:
> > > > >> > >> > >
> > > > >> > >> > >> Hi, Ismael,
> > > > >> > >> > >>
> > > > >> > >> > >> For #3, typically, an admin won't configure more io threads
> > > > >> > >> > >> than CPU cores, but it's possible for an admin to start with
> > > > >> > >> > >> fewer io threads than cores and grow that later on.
> > > > >> > >> > >>
> > > > >> > >> > >> Hi, Dong,
> > > > >> > >> > >>
> > > > >> > >> > >> I think the throttleTime sensor on the broker tells the
> > > > >> > >> > >> admin whether a user/clientId is throttled or not.
> > > > >> > >> > >>
> > > > >> > >> > >> Hi, Radai,
> > > > >> > >> > >>
> > > > >> > >> > >> The reasoning for delaying throttled requests on the broker
> > > > >> > >> > >> instead of returning an error immediately is that the latter
> > > > >> > >> > >> has no way to prevent the client from retrying immediately,
> > > > >> > >> > >> which will make things worse. The delaying logic is based on
> > > > >> > >> > >> a delay queue. A separate expiration thread just waits on
> > > > >> > >> > >> the next request to expire. So, it doesn't tie up a request
> > > > >> > >> > >> handler thread.
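A hedged sketch of such a delay queue in Java (names are hypothetical; this is just the java.util.concurrent.DelayQueue pattern, not Kafka's actual code):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // Throttled responses are parked until their delay expires; a single
    // expiration thread blocks on the queue, so no handler thread is tied up.
    public class ThrottledResponseQueue {
        static final class ThrottledResponse implements Delayed {
            final long sendAtNanos;
            final Runnable sendResponse;

            ThrottledResponse(long delayMs, Runnable sendResponse) {
                this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
                this.sendResponse = sendResponse;
            }

            @Override public long getDelay(TimeUnit unit) {
                return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
            }

            @Override public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                    other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        public void throttle(long delayMs, Runnable sendResponse) {
            queue.put(new ThrottledResponse(delayMs, sendResponse));
        }

        // Run on one dedicated expiration thread.
        public void expirationLoop() throws InterruptedException {
            while (true) {
                queue.take().sendResponse.run(); // blocks until next delay expires
            }
        }
    }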
> > > > >> > >> > >>
> > > > >> > >> > >> Thanks,
> > > > >> > >> > >>
> > > > >> > >> > >> Jun
> > > > >> > >> > >>
> > > > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <
> > > > ismael@juma.me.uk
> > > > >> >
> > > > >> > >> wrote:
> > > > >> > >> > >>
> > > > >> > >> > >> > Hi Jay,
> > > > >> > >> > >> >
> > > > >> > >> > >> > Regarding 1, I definitely like the simplicity of keeping a
> > > > >> > >> > >> > single throttle time field in the response. The downside
> > > > >> > >> > >> > is that the client metrics will be more coarse grained.
> > > > >> > >> > >> >
> > > > >> > >> > >> > Regarding 3, we have
> > > > >> > >> > >> > `leader.imbalance.per.broker.percentage` and
> > > > >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > > > >> > >> > >> >
> > > > >> > >> > >> > Ismael
> > > > >> > >> > >> >
> > > > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <
> > > > jay@confluent.io>
> > > > >> > >> wrote:
> > > > >> > >> > >> >
> > > > >> > >> > >> > > A few minor comments:
> > > > >> > >> > >> > >
> > > > >> > >> > >> > >    1. Isn't it the case that the throttling time response
> > > > >> > >> > >> > >    field should have the total time your request was
> > > > >> > >> > >> > >    throttled, irrespective of the quotas that caused it?
> > > > >> > >> > >> > >    Limiting it to the byte rate quota doesn't make sense,
> > > > >> > >> > >> > >    but I also don't think we want to end up adding new
> > > > >> > >> > >> > >    fields in the response for every single thing we
> > > > >> > >> > >> > >    quota, right?
> > > > >> > >> > >> > >    2. I don't think we should make this quota
> > > > >> > >> > >> > >    specifically about io threads. Once we introduce these
> > > > >> > >> > >> > >    quotas people set them and expect them to be enforced
> > > > >> > >> > >> > >    (and if they aren't it may cause an outage). As a
> > > > >> > >> > >> > >    result they are a bit more sensitive than normal
> > > > >> > >> > >> > >    configs, I think. The current thread pools seem like
> > > > >> > >> > >> > >    something of an implementation detail and not the
> > > > >> > >> > >> > >    level the user-facing quotas should be involved with.
> > > > >> > >> > >> > >    I think it might be better to make this a general
> > > > >> > >> > >> > >    request-time throttle with no mention of I/O threads
> > > > >> > >> > >> > >    in the naming, and simply acknowledge in the docs the
> > > > >> > >> > >> > >    current limitation (which we may someday fix) that
> > > > >> > >> > >> > >    this covers only the time after the request is read
> > > > >> > >> > >> > >    off the network.
> > > > >> > >> > >> > >    3. As such I think the right interface to the user
> > > > >> > >> > >> > >    would be something like percent_request_time in
> > > > >> > >> > >> > >    {0,...,100} or request_time_ratio in {0.0,...,1.0} (I
> > > > >> > >> > >> > >    think "ratio" is the terminology we used in the other
> > > > >> > >> > >> > >    metrics if the scale is between 0 and 1, right?)
> > > > >> > >> > >> > >
> > > > >> > >> > >> > > -Jay
> > > > >> > >> > >> > >
> > > > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
> > > > >> > >> > >> rajinisivaram@gmail.com
> > > > >> > >> > >> > >
> > > > >> > >> > >> > > wrote:
> > > > >> > >> > >> > >
> > > > >> > >> > >> > > > Guozhang/Dong,
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > Thank you for the feedback.
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > Guozhang: I have updated the section on co-existence of
> > > > >> > >> > >> > > > byte rate and request time quotas.
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > Dong: I hadn't added much detail to the metrics and
> > > > >> > >> > >> > > > sensors since they are going to be very similar to the
> > > > >> > >> > >> > > > existing metrics and sensors. To avoid confusion, I
> > > > >> > >> > >> > > > have now added more detail. All metrics are in the
> > > > >> > >> > >> > > > group "quotaType" and all sensors have names starting
> > > > >> > >> > >> > > > with "quotaType" (where quotaType is Produce/Fetch/
> > > > >> > >> > >> > > > LeaderReplication/FollowerReplication/*IOThread*). So
> > > > >> > >> > >> > > > there will be no reuse of existing metrics/sensors. The
> > > > >> > >> > >> > > > new ones for request processing time based throttling
> > > > >> > >> > >> > > > will be completely independent of existing
> > > > >> > >> > >> > > > metrics/sensors, but will be consistent in format.
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > The existing throttle_time_ms field in produce/fetch
> > > > >> > >> > >> > > > responses will not be impacted by this KIP. That will
> > > > >> > >> > >> > > > continue to return byte-rate based throttling times. In
> > > > >> > >> > >> > > > addition, a new field request_throttle_time_ms will be
> > > > >> > >> > >> > > > added to return request quota based throttling times.
> > > > >> > >> > >> > > > These will be exposed as new metrics on the
> > > > >> > >> > >> > > > client-side.
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > Since all metrics and sensors are different for each
> > > > >> > >> > >> > > > type of quota, I believe there are already sufficient
> > > > >> > >> > >> > > > metrics to monitor throttling on both the client and
> > > > >> > >> > >> > > > broker side for each type of throttling.
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > Regards,
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > Rajini
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <
> > > > >> > lindong28@gmail.com
> > > > >> > >> >
> > > > >> > >> > >> wrote:
> > > > >> > >> > >> > > >
> > > > >> > >> > >> > > > > Hey Rajini,
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > I think it makes a lot of sense to use
> > > > >> > >> > >> > > > > io_thread_units as the metric to quota users' traffic
> > > > >> > >> > >> > > > > here. LGTM overall. I have some questions regarding
> > > > >> > >> > >> > > > > sensors.
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > - Can you be more specific in the KIP about what
> > > > >> > >> > >> > > > > sensors will be added? For example, it will be useful
> > > > >> > >> > >> > > > > to specify the name and attributes of these new
> > > > >> > >> > >> > > > > sensors.
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > - We currently have throttle-time and queue-size for
> > > > >> > >> > >> > > > > the byte-rate based quota. Are you going to have a
> > > > >> > >> > >> > > > > separate throttle-time and queue-size for requests
> > > > >> > >> > >> > > > > throttled by the io_thread_unit-based quota, or will
> > > > >> > >> > >> > > > > they share the same sensor?
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > - Does the throttle-time in the ProduceResponse and
> > > > >> > >> > >> > > > > FetchResponse contain time due to the
> > > > >> > >> > >> > > > > io_thread_unit-based quota?
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > - Currently the kafka server doesn't provide any log
> > > > >> > >> > >> > > > > or metrics that tell whether any given clientId (or
> > > > >> > >> > >> > > > > user) is throttled. This is not too bad because we
> > > > >> > >> > >> > > > > can still check the client-side byte-rate metric to
> > > > >> > >> > >> > > > > validate whether a given client is throttled. But
> > > > >> > >> > >> > > > > with this io_thread_unit, there will be no way to
> > > > >> > >> > >> > > > > validate whether a given client is slow because it
> > > > >> > >> > >> > > > > has exceeded its io_thread_unit limit. It is
> > > > >> > >> > >> > > > > necessary for users to be able to know this
> > > > >> > >> > >> > > > > information to figure out whether they have reached
> > > > >> > >> > >> > > > > their quota limit. How about we add a log4j log on
> > > > >> > >> > >> > > > > the server side to periodically print the (client_id,
> > > > >> > >> > >> > > > > byte-rate-throttle-time,
> > > > >> > >> > >> > > > > io-thread-unit-throttle-time) so that the kafka
> > > > >> > >> > >> > > > > administrator can figure out which users have reached
> > > > >> > >> > >> > > > > their limit and act accordingly?
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > Thanks,
> > > > >> > >> > >> > > > > Dong
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
> > > > >> > >> > >> wangguoz@gmail.com>
> > > > >> > >> > >> > > > wrote:
> > > > >> > >> > >> > > > >
> > > > >> > >> > >> > > > > > Made a pass over the doc, overall LGTM except a
> > > > >> > >> > >> > > > > > minor comment on the throttling implementation:
> > > > >> > >> > >> > > > > >
> > > > >> > >> > >> > > > > > Stated as "Request processing time throttling will
> > > > >> > >> > >> > > > > > be applied on top if necessary." I thought that it
> > > > >> > >> > >> > > > > > meant the request processing time throttling is
> > > > >> > >> > >> > > > > > applied first, but continuing to read I found it
> > > > >> > >> > >> > > > > > actually meant to apply produce / fetch byte rate
> > > > >> > >> > >> > > > > > throttling first.
> > > > >> > >> > >> > > > > >
> > > > >> > >> > >> > > > > > Also the last sentence "The remaining delay if any
> > > > >> > >> > >> > > > > > is applied to the response." is a bit confusing to
> > > > >> > >> > >> > > > > > me. Maybe reword it a bit?
> > > > >> > >> > >> > > > > >
> > > > >> > >> > >> > > > > >
> > > > >> > >> > >> > > > > > Guozhang
> > > > >> > >> > >> > > > > >
> > > > >> > >> > >> > > > > >
> > > > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <
> > > > >> > jun@confluent.io
> > > > >> > >> >
> > > > >> > >> > >> wrote:
> > > > >> > >> > >> > > > > >
> > > > >> > >> > >> > > > > > > Hi, Rajini,
> > > > >> > >> > >> > > > > > >
> > > > >> > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal
> > > > >> > >> > >> > > > > > > looks good to me.
> > > > >> > >> > >> > > > > > >
> > > > >> > >> > >> > > > > > > Jun
> > > > >> > >> > >> > > > > > >
> > > > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini
> > Sivaram
> > > <
> > > > >> > >> > >> > > > > rajinisivaram@gmail.com
> > > > >> > >> > >> > > > > > >
> > > > >> > >> > >> > > > > > > wrote:
> > > > >> > >> > >> > > > > > >
> > > > >> > >> > >> > > > > > > > Jun/Roger,
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > Thank you for the feedback.
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > 1. I have updated the KIP to use absolute units
> > > > >> > >> > >> > > > > > > > instead of percentage. The property is called
> > > > >> > >> > >> > > > > > > > *io_thread_units* to align with the thread
> > > > >> > >> > >> > > > > > > > count property *num.io.threads*. When we
> > > > >> > >> > >> > > > > > > > implement network thread utilization quotas, we
> > > > >> > >> > >> > > > > > > > can add another property
> > > > >> > >> > >> > > > > > > > *network_thread_units*.
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > 2. ControlledShutdown is already listed under
> > > > >> > >> > >> > > > > > > > the exempt requests. Jun, did you mean a
> > > > >> > >> > >> > > > > > > > different request that needs to be added? The
> > > > >> > >> > >> > > > > > > > four requests currently exempt in the KIP are
> > > > >> > >> > >> > > > > > > > StopReplica, ControlledShutdown, LeaderAndIsr
> > > > >> > >> > >> > > > > > > > and UpdateMetadata. These are controlled using
> > > > >> > >> > >> > > > > > > > the ClusterAction ACL, so it is easy to exclude
> > > > >> > >> > >> > > > > > > > them and only throttle if unauthorized. I
> > > > >> > >> > >> > > > > > > > wasn't sure if there are other requests used
> > > > >> > >> > >> > > > > > > > only for inter-broker communication that needed
> > > > >> > >> > >> > > > > > > > to be excluded.
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > 3. I was thinking the smallest change would be
> > > > >> > >> > >> > > > > > > > to replace all references to
> > > > >> > >> > >> > > > > > > > *requestChannel.sendResponse()* with a local
> > > > >> > >> > >> > > > > > > > method *sendResponseMaybeThrottle()* that does
> > > > >> > >> > >> > > > > > > > the throttling, if any, plus sends the
> > > > >> > >> > >> > > > > > > > response. If we throttle first in
> > > > >> > >> > >> > > > > > > > *KafkaApis.handle()*, the time spent within the
> > > > >> > >> > >> > > > > > > > method handling the request will not be
> > > > >> > >> > >> > > > > > > > recorded or used in throttling. We can look
> > > > >> > >> > >> > > > > > > > into this again when the PR is ready for
> > > > >> > >> > >> > > > > > > > review.
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > Regards,
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > Rajini
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger
> > Hoover
> > > <
> > > > >> > >> > >> > > > > roger.hoover@gmail.com>
> > > > >> > >> > >> > > > > > > > wrote:
> > > > >> > >> > >> > > > > > > >
> > > > >> > >> > >> > > > > > > > > Great to see this KIP and the excellent
> > > > >> discussion.
> > > > >> > >> > >> > > > > > > > >
> > > > >> > >> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If
> > my
> > > > >> > >> application
> > > > >> > >> > is
> > > > >> > >> > >> > > > > allocated
> > > > >> > >> > >> > > > > > 1
> > > > >> > >> > >> > > > > > > > > request handler unit, then it's as if I
> > > have a
> > > > >> > Kafka
> > > > >> > >> > >> broker
> > > > >> > >> > >> > > with
> > > > >> > >> > >> > > > a
> > > > >> > >> > >> > > > > > > single
> > > > >> > >> > >> > > > > > > > > request handler thread dedicated to me.
> > > > That's
> > > > >> the
> > > > >> > >> > most I
> > > > >> > >> > >> > can
> > > > >> > >> > >> > > > use,
> > > > >> > >> > >> > > > > > at
> > > > >> > >> > >> > > > > > > > > least.  That allocation doesn't change
> > even
> > > if
> > > > >> an
> > > > >> > >> admin
> > > > >> > >> > >> later
> > > > >> > >> > >> > > > > > increases
> > > > >> > >> > >> > > > > > > > the
> > > > >> > >> > >> > > > > > > > > size of the request thread pool on the
> > > broker.
> > > > >> > It's
> > > > >> > >> > >> similar
> > > > >> > >> > >> > to
> > > > >> > >> > >> > > > the
> > > > >> > >> > >> > > > > > CPU
> > > > >> > >> > >> > > > > > > > > abstraction that VMs and containers get
> > from
> > > > >> > >> hypervisors
> > > > >> > >> > >> or
> > > > >> > >> > >> > OS
> > > > >> > >> > >> > > > > > > > schedulers.
> > > > >> > >> > >> > > > > > > > > While different client access patterns
> can
> > > use
> > > > >> > wildly
> > > > >> > >> > >> > different
> > > > >> > >> > >> > > > > > amounts
> > > > >> > >> > >> > > > > > > > of
> > > > >> > >> > >> > > > > > > > > request thread resources per request, a
> > > given
> > > > >> > >> > application
> > > > >> > >> > >> > will
> > > > >> > >> > >> > > > > > > generally
> > > > >> > >> > >> > > > > > > > > have a stable access pattern and can
> > figure
> > > > out
> > > > >> > >> > >> empirically
> > > > >> > >> > >> > how
> > > > >> > >> > >> > > > > many
> > > > >> > >> > >> > > > > > > > > "request thread units" it needs to meet
> > it's
> > > > >> > >> > >> > throughput/latency
> > > > >> > >> > >> > > > > > goals.
> > > > >> > >> > >> > > > > > > > >
> > > > >> > >> > >> > > > > > > > > Cheers,
> > > > >> > >> > >> > > > > > > > >
> > > > >> > >> > >> > > > > > > > > Roger
> > > > >> > >> > >> > > > > > > > >
> > > > >> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun
> Rao <
> > > > >> > >> > >> jun@confluent.io>
> > > > >> > >> > >> > > > wrote:
> > > > >> > >> > >> > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > Hi, Rajini,
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > Thanks for the updated KIP. A few more
> > > > >> comments.
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > 1. A concern of request_time_percent
> is
> > > that
> > > > >> it's
> > > > >> > >> not
> > > > >> > >> > an
> > > > >> > >> > >> > > > absolute
> > > > >> > >> > >> > > > > > > > value.
> > > > >> > >> > >> > > > > > > > > > Let's say you give a user a 10% limit.
> > If
> > > > the
> > > > >> > admin
> > > > >> > >> > >> doubles
> > > > >> > >> > >> > > the
> > > > >> > >> > >> > > > > > > number
> > > > >> > >> > >> > > > > > > > of
> > > > >> > >> > >> > > > > > > > > > request handler threads, that user now
> > > > >> actually
> > > > >> > has
> > > > >> > >> > >> twice
> > > > >> > >> > >> > the
> > > > >> > >> > >> > > > > > > absolute
> > > > >> > >> > >> > > > > > > > > > capacity. This may confuse people a
> bit.
> > > So,
> > > > >> > >> perhaps
> > > > >> > >> > >> > setting
> > > > >> > >> > >> > > > the
> > > > >> > >> > >> > > > > > > quota
> > > > >> > >> > >> > > > > > > > > > based on an absolute request thread
> unit
> > > is
> > > > >> > better.
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also
> an
> > > > >> > >> inter-broker
> > > > >> > >> > >> > request
> > > > >> > >> > >> > > > and
> > > > >> > >> > >> > > > > > > needs
> > > > >> > >> > >> > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > be excluded from throttling.
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > 3. Implementation wise, I am wondering
> > if
> > > > it's
> > > > >> > >> simpler
> > > > >> > >> > >> to
> > > > >> > >> > >> > > apply
> > > > >> > >> > >> > > > > the
> > > > >> > >> > >> > > > > > > > > request
> > > > >> > >> > >> > > > > > > > > > time throttling first in
> > > KafkaApis.handle().
> > > > >> > >> > Otherwise,
> > > > >> > >> > >> we
> > > > >> > >> > >> > > will
> > > > >> > >> > >> > > > > > need
> > > > >> > >> > >> > > > > > > to
> > > > >> > >> > >> > > > > > > > > add
> > > > >> > >> > >> > > > > > > > > > the throttling logic in each type of
> > > > request.
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > Thanks,
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > Jun
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM,
> Rajini
> > > > >> Sivaram <
> > > > >> > >> > >> > > > > > > > rajinisivaram@gmail.com
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > wrote:
> > > > >> > >> > >> > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > Jun,
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > Thank you for the review.
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > I have reverted to the original KIP
> > that
> > > > >> > >> throttles
> > > > >> > >> > >> based
> > > > >> > >> > >> > on
> > > > >> > >> > >> > > > > > request
> > > > >> > >> > >> > > > > > > > > > handler
> > > > >> > >> > >> > > > > > > > > > > utilization. At the moment, it uses
> > > > >> percentage,
> > > > >> > >> but
> > > > >> > >> > I
> > > > >> > >> > >> am
> > > > >> > >> > >> > > > happy
> > > > >> > >> > >> > > > > to
> > > > >> > >> > >> > > > > > > > > change
> > > > >> > >> > >> > > > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > > a fraction (out of 1 instead of 100)
> > if
> > > > >> > >> required. I
> > > > >> > >> > >> have
> > > > >> > >> > >> > > > added
> > > > >> > >> > >> > > > > > the
> > > > >> > >> > >> > > > > > > > > > examples
> > > > >> > >> > >> > > > > > > > > > > from this discussion to the KIP.
> Also
> > > > added
> > > > >> a
> > > > >> > >> > "Future
> > > > >> > >> > >> > Work"
> > > > >> > >> > >> > > > > > section
> > > > >> > >> > >> > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > > address network thread utilization.
> > The
> > > > >> > >> > configuration
> > > > >> > >> > >> is
> > > > >> > >> > >> > > > named
> > > > >> > >> > >> > > > > > > > > > > "request_time_percent" with the
> > > > expectation
> > > > >> > that
> > > > >> > >> it
> > > > >> > >> > >> can
> > > > >> > >> > >> > > also
> > > > >> > >> > >> > > > be
> > > > >> > >> > >> > > > > > > used
> > > > >> > >> > >> > > > > > > > as
> > > > >> > >> > >> > > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > limit for network thread utilization
> > > when
> > > > >> that
> > > > >> > is
> > > > >> > >> > >> > > > implemented,
> > > > >> > >> > >> > > > > so
> > > > >> > >> > >> > > > > > > > that
> > > > >> > >> > >> > > > > > > > > > > users have to set only one config
> for
> > > the
> > > > >> two
> > > > >> > and
> > > > >> > >> > not
> > > > >> > >> > >> > have
> > > > >> > >> > >> > > to
> > > > >> > >> > >> > > > > > worry
> > > > >> > >> > >> > > > > > > > > about
> > > > >> > >> > >> > > > > > > > > > > the internal distribution of the
> work
> > > > >> between
> > > > >> > the
> > > > >> > >> > two
> > > > >> > >> > >> > > thread
> > > > >> > >> > >> > > > > > pools
> > > > >> > >> > >> > > > > > > in
> > > > >> > >> > >> > > > > > > > > > > Kafka.
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > Regards,
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > Rajini
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM,
> Jun
> > > Rao
> > > > <
> > > > >> > >> > >> > > jun@confluent.io>
> > > > >> > >> > >> > > > > > > wrote:
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > Hi, Rajini,
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > Thanks for the proposal.
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > The benefit of using the request
> > > > >> processing
> > > > >> > >> time
> > > > >> > >> > >> over
> > > > >> > >> > >> > the
> > > > >> > >> > >> > > > > > request
> > > > >> > >> > >> > > > > > > > > rate
> > > > >> > >> > >> > > > > > > > > > is
> > > > >> > >> > >> > > > > > > > > > > > exactly what people have said. I
> > will
> > > > just
> > > > >> > >> expand
> > > > >> > >> > >> that
> > > > >> > >> > >> > a
> > > > >> > >> > >> > > > bit.
> > > > >> > >> > >> > > > > > > > > Consider
> > > > >> > >> > >> > > > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > following case. The producer
> sends a
> > > > >> produce
> > > > >> > >> > request
> > > > >> > >> > >> > > with a
> > > > >> > >> > >> > > > > > 10MB
> > > > >> > >> > >> > > > > > > > > > message
> > > > >> > >> > >> > > > > > > > > > > > but compressed to 100KB with gzip.
> > The
> > > > >> > >> > >> decompression of
> > > > >> > >> > >> > > the
> > > > >> > >> > >> > > > > > > message
> > > > >> > >> > >> > > > > > > > > on
> > > > >> > >> > >> > > > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > broker could take 10-15 seconds,
> > > during
> > > > >> which
> > > > >> > >> > time,
> > > > >> > >> > >> a
> > > > >> > >> > >> > > > request
> > > > >> > >> > >> > > > > > > > handler
> > > > >> > >> > >> > > > > > > > > > > > thread is completely blocked. In
> > this
> > > > >> case,
> > > > >> > >> > neither
> > > > >> > >> > >> the
> > > > >> > >> > >> > > > > byte-in
> > > > >> > >> > >> > > > > > > > quota
> > > > >> > >> > >> > > > > > > > > > nor
> > > > >> > >> > >> > > > > > > > > > > > the request rate quota may be
> > > effective
> > > > in
> > > > >> > >> > >> protecting
> > > > >> > >> > >> > the
> > > > >> > >> > >> > > > > > broker.
> > > > >> > >> > >> > > > > > > > > > > Consider
> > > > >> > >> > >> > > > > > > > > > > > another case. A consumer group
> > starts
> > > > >> with 10
> > > > >> > >> > >> instances
> > > > >> > >> > >> > > and
> > > > >> > >> > >> > > > > > later
> > > > >> > >> > >> > > > > > > > on
> > > > >> > >> > >> > > > > > > > > > > > switches to 20 instances. The
> > request
> > > > rate
> > > > >> > will
> > > > >> > >> > >> likely
> > > > >> > >> > >> > > > > double,
> > > > >> > >> > >> > > > > > > but
> > > > >> > >> > >> > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > actually load on the broker may
> not
> > > > double
> > > > >> > >> since
> > > > >> > >> > >> each
> > > > >> > >> > >> > > fetch
> > > > >> > >> > >> > > > > > > request
> > > > >> > >> > >> > > > > > > > > > only
> > > > >> > >> > >> > > > > > > > > > > > contains half of the partitions.
> > > Request
> > > > >> rate
> > > > >> > >> > quota
> > > > >> > >> > >> may
> > > > >> > >> > >> > > not
> > > > >> > >> > >> > > > > be
> > > > >> > >> > >> > > > > > > easy
> > > > >> > >> > >> > > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > > > configure in this case.
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > What we really want is to be able
> to
> > > > >> prevent
> > > > >> > a
> > > > >> > >> > >> client
> > > > >> > >> > >> > > from
> > > > >> > >> > >> > > > > > using
> > > > >> > >> > >> > > > > > > > too
> > > > >> > >> > >> > > > > > > > > > much
> > > > >> > >> > >> > > > > > > > > > > > of the server side resources. In
> > this
> > > > >> > >> particular
> > > > >> > >> > >> KIP,
> > > > >> > >> > >> > > this
> > > > >> > >> > >> > > > > > > resource
> > > > >> > >> > >> > > > > > > > > is
> > > > >> > >> > >> > > > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > capacity of the request handler
> > > > threads. I
> > > > >> > >> agree
> > > > >> > >> > >> that
> > > > >> > >> > >> > it
> > > > >> > >> > >> > > > may
> > > > >> > >> > >> > > > > > not
> > > > >> > >> > >> > > > > > > be
> > > > >> > >> > >> > > > > > > > > > > > intuitive for the users to
> determine
> > > how
> > > > >> to
> > > > >> > set
> > > > >> > >> > the
> > > > >> > >> > >> > right
> > > > >> > >> > >> > > > > > limit.
> > > > >> > >> > >> > > > > > > > > > However,
> > > > >> > >> > >> > > > > > > > > > > > this is not completely new and has
> > > been
> > > > >> done
> > > > >> > in
> > > > >> > >> > the
> > > > >> > >> > >> > > > container
> > > > >> > >> > >> > > > > > > world
> > > > >> > >> > >> > > > > > > > > > > > already. For example, Linux
> cgroup (
> > > > >> > >> > >> > > > > https://access.redhat.com/
> > > > >> > >> > >> > > > > > > > > > > > documentation/en-US/Red_Hat_En
> > > > >> > >> > >> terprise_Linux/6/html/
> > > > >> > >> > >> > > > > > > > > > > > Resource_Management_Guide/sec-
> > > cpu.html)
> > > > >> has
> > > > >> > >> the
> > > > >> > >> > >> > concept
> > > > >> > >> > >> > > of
> > > > >> > >> > >> > > > > > > > > > > > cpu.cfs_quota_us,
> > > > >> > >> > >> > > > > > > > > > > > which specifies the total amount
> of
> > > time
> > > > >> in
> > > > >> > >> > >> > microseconds
> > > > >> > >> > >> > > > for
> > > > >> > >> > >> > > > > > > which
> > > > >> > >> > >> > > > > > > > > all
> > > > >> > >> > >> > > > > > > > > > > > tasks in a cgroup can run during a
> > one
> > > > >> second
> > > > >> > >> > >> period.
> > > > >> > >> > >> > We
> > > > >> > >> > >> > > > can
> > > > >> > >> > >> > > > > > > > > > potentially
> > > > >> > >> > >> > > > > > > > > > > > model the request handler threads
> > in a
> > > > >> > similar
> > > > >> > >> > way.
> > > > >> > >> > >> For
> > > > >> > >> > >> > > > > > example,
> > > > >> > >> > >> > > > > > > > each
> > > > >> > >> > >> > > > > > > > > > > > request handler thread can be 1
> > > request
> > > > >> > handler
> > > > >> > >> > unit
> > > > >> > >> > >> > and
> > > > >> > >> > >> > > > the
> > > > >> > >> > >> > > > > > > admin
> > > > >> > >> > >> > > > > > > > > can
> > > > >> > >> > >> > > > > > > > > > > > configure a limit on how many
> units
> > > (say
> > > > >> > 0.01)
> > > > >> > >> a
> > > > >> > >> > >> client
> > > > >> > >> > >> > > can
> > > > >> > >> > >> > > > > > have.
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > Regarding not throttling the
> > internal
> > > > >> broker
> > > > >> > to
> > > > >> > >> > >> broker
> > > > >> > >> > >> > > > > > requests.
> > > > >> > >> > >> > > > > > > We
> > > > >> > >> > >> > > > > > > > > > could
> > > > >> > >> > >> > > > > > > > > > > > do that. Alternatively, we could
> > just
> > > > let
> > > > >> the
> > > > >> > >> > admin
> > > > >> > >> > >> > > > > configure a
> > > > >> > >> > >> > > > > > > > high
> > > > >> > >> > >> > > > > > > > > > > limit
> > > > >> > >> > >> > > > > > > > > > > > for the kafka user (it may not be
> > able
> > > > to
> > > > >> do
> > > > >> > >> that
> > > > >> > >> > >> > easily
> > > > >> > >> > >> > > > > based
> > > > >> > >> > >> > > > > > on
> > > > >> > >> > >> > > > > > > > > > > clientId
> > > > >> > >> > >> > > > > > > > > > > > though).
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > Ideally we want to be able to
> > protect
> > > > the
> > > > >> > >> > >> utilization
> > > > >> > >> > >> > of
> > > > >> > >> > >> > > > the
> > > > >> > >> > >> > > > > > > > network
> > > > >> > >> > >> > > > > > > > > > > thread
> > > > >> > >> > >> > > > > > > > > > > > pool too. The difficult is mostly
> > what
> > > > >> Rajini
> > > > >> > >> > said:
> > > > >> > >> > >> (1)
> > > > >> > >> > >> > > The
> > > > >> > >> > >> > > > > > > > mechanism
> > > > >> > >> > >> > > > > > > > > > for
> > > > >> > >> > >> > > > > > > > > > > > throttling the requests is through
> > > > >> Purgatory
> > > > >> > >> and
> > > > >> > >> > we
> > > > >> > >> > >> > will
> > > > >> > >> > >> > > > have
> > > > >> > >> > >> > > > > > to
> > > > >> > >> > >> > > > > > > > > think
> > > > >> > >> > >> > > > > > > > > > > > through how to integrate that into
> > the
> > > > >> > network
> > > > >> > >> > >> layer.
> > > > >> > >> > >> > > (2)
> > > > >> > >> > >> > > > In
> > > > >> > >> > >> > > > > > the
> > > > >> > >> > >> > > > > > > > > > network
> > > > >> > >> > >> > > > > > > > > > > > layer, currently we know the user,
> > but
> > > > not
> > > > >> > the
> > > > >> > >> > >> clientId
> > > > >> > >> > >> > > of
> > > > >> > >> > >> > > > > the
> > > > >> > >> > >> > > > > > > > > request.
> > > > >> > >> > >> > > > > > > > > > > So,
> > > > >> > >> > >> > > > > > > > > > > > it's a bit tricky to throttle
> based
> > on
> > > > >> > clientId
> > > > >> > >> > >> there.
> > > > >> > >> > >> > > > Plus,
> > > > >> > >> > >> > > > > > the
> > > > >> > >> > >> > > > > > > > > > byteOut
> > > > >> > >> > >> > > > > > > > > > > > quota can already protect the
> > network
> > > > >> thread
> > > > >> > >> > >> > utilization
> > > > >> > >> > >> > > > for
> > > > >> > >> > >> > > > > > > fetch
> > > > >> > >> > >> > > > > > > > > > > > requests. So, if we can't figure
> out
> > > > this
> > > > >> > part
> > > > >> > >> > right
> > > > >> > >> > >> > now,
> > > > >> > >> > >> > > > > just
> > > > >> > >> > >> > > > > > > > > focusing
> > > > >> > >> > >> > > > > > > > > > > on
> > > > >> > >> > >> > > > > > > > > > > > the request handling threads for
> > this
> > > > KIP
> > > > >> is
> > > > >> > >> > still a
> > > > >> > >> > >> > > useful
> > > > >> > >> > >> > > > > > > > feature.
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > Thanks,
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > Jun
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM,
> > > Rajini
> > > > >> > >> Sivaram <
> > > > >> > >> > >> > > > > > > > > > rajinisivaram@gmail.com
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > wrote:
> > > > >> > >> > >> > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > Thank you all for the feedback.
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > Jay: I have removed exemption
> for
> > > > >> consumer
> > > > >> > >> > >> heartbeat
> > > > >> > >> > >> > > etc.
> > > > >> > >> > >> > > > > > Agree
> > > > >> > >> > >> > > > > > > > > that
> > > > >> > >> > >> > > > > > > > > > > > > protecting the cluster is more
> > > > important
> > > > >> > than
> > > > >> > >> > >> > > protecting
> > > > >> > >> > >> > > > > > > > individual
> > > > >> > >> > >> > > > > > > > > > > apps.
> > > > >> > >> > >> > > > > > > > > > > > > Have retained the exemption for
> > > > >> > >> > >> > > StopReplicat/LeaderAndIsr
> > > > >> > >> > >> > > > > > etc,
> > > > >> > >> > >> > > > > > > > > these
> > > > >> > >> > >> > > > > > > > > > > are
> > > > >> > >> > >> > > > > > > > > > > > > throttled only if authorization
> > > fails
> > > > >> (so
> > > > >> > >> can't
> > > > >> > >> > be
> > > > >> > >> > >> > used
> > > > >> > >> > >> > > > for
> > > > >> > >> > >> > > > > > DoS
> > > > >> > >> > >> > > > > > > > > > attacks
> > > > >> > >> > >> > > > > > > > > > > > in
> > > > >> > >> > >> > > > > > > > > > > > > a secure cluster, but allows
> > > > >> inter-broker
> > > > >> > >> > >> requests to
> > > > >> > >> > >> > > > > > complete
> > > > >> > >> > >> > > > > > > > > > without
> > > > >> > >> > >> > > > > > > > > > > > > delays).
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > I will wait another day to see
> if
> > > > these
> > > > >> is
> > > > >> > >> any
> > > > >> > >> > >> > > objection
> > > > >> > >> > >> > > > to
> > > > >> > >> > >> > > > > > > > quotas
> > > > >> > >> > >> > > > > > > > > > > based
> > > > >> > >> > >> > > > > > > > > > > > on
> > > > >> > >> > >> > > > > > > > > > > > > request processing time (as
> > opposed
> > > to
> > > > >> > >> request
> > > > >> > >> > >> rate)
> > > > >> > >> > >> > > and
> > > > >> > >> > >> > > > if
> > > > >> > >> > >> > > > > > > there
> > > > >> > >> > >> > > > > > > > > are
> > > > >> > >> > >> > > > > > > > > > > no
> > > > >> > >> > >> > > > > > > > > > > > > objections, I will revert to the
> > > > >> original
> > > > >> > >> > proposal
> > > > >> > >> > >> > with
> > > > >> > >> > >> > > > > some
> > > > >> > >> > >> > > > > > > > > changes.
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > The original proposal was only
> > > > including
> > > > >> > the
> > > > >> > >> > time
> > > > >> > >> > >> > used
> > > > >> > >> > >> > > by
> > > > >> > >> > >> > > > > the
> > > > >> > >> > >> > > > > > > > > request
> > > > >> > >> > >> > > > > > > > > > > > > handler threads (that made
> > > calculation
> > > > >> > >> easy). I
> > > > >> > >> > >> think
> > > > >> > >> > >> > > the
> > > > >> > >> > >> > > > > > > > > suggestion
> > > > >> > >> > >> > > > > > > > > > is
> > > > >> > >> > >> > > > > > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > > > > include the time spent in the
> > > network
> > > > >> > >> threads as
> > > > >> > >> > >> well
> > > > >> > >> > >> > > > since
> > > > >> > >> > >> > > > > > > that
> > > > >> > >> > >> > > > > > > > > may
> > > > >> > >> > >> > > > > > > > > > be
> > > > >> > >> > >> > > > > > > > > > > > > significant. As Jay pointed out,
> > it
> > > is
> > > > >> more
> > > > >> > >> > >> > complicated
> > > > >> > >> > >> > > > to
> > > > >> > >> > >> > > > > > > > > calculate
> > > > >> > >> > >> > > > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > > total available CPU time and
> > convert
> > > > to
> > > > >> a
> > > > >> > >> ratio
> > > > >> > >> > >> when
> > > > >> > >> > >> > > > there
> > > > >> > >> > >> > > > > > *m*
> > > > >> > >> > >> > > > > > > > I/O
> > > > >> > >> > >> > > > > > > > > > > > threads
> > > > >> > >> > >> > > > > > > > > > > > > and *n* network threads.
> > > > >> > >> > >> > ThreadMXBean#getThreadCPUTime(
> > > > >> > >> > >> > > )
> > > > >> > >> > >> > > > > may
> > > > >> > >> > >> > > > > > > > give
> > > > >> > >> > >> > > > > > > > > us
> > > > >> > >> > >> > > > > > > > > > > > what
> > > > >> > >> > >> > > > > > > > > > > > > we want, but it can be very
> > > expensive
> > > > on
> > > > >> > some
> > > > >> > >> > >> > > platforms.
> > > > >> > >> > >> > > > As
> > > > >> > >> > >> > > > > > > > Becket
> > > > >> > >> > >> > > > > > > > > > and
> > > > >> > >> > >> > > > > > > > > > > > > Guozhang have pointed out, we do
> > > have
> > > > >> > several
> > > > >> > >> > time
> > > > >> > >> > >> > > > > > measurements
> > > > >> > >> > >> > > > > > > > > > already
> > > > >> > >> > >> > > > > > > > > > > > for
> > > > >> > >> > >> > > > > > > > > > > > > generating metrics that we could
> > > use,
> > > > >> > though
> > > > >> > >> we
> > > > >> > >> > >> might
> > > > >> > >> > >> > > > want
> > > > >> > >> > >> > > > > to
> > > > >> > >> > >> > > > > > > > > switch
> > > > >> > >> > >> > > > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > > > > nanoTime() instead of
> > > > >> currentTimeMillis()
> > > > >> > >> since
> > > > >> > >> > >> some
> > > > >> > >> > >> > of
> > > > >> > >> > >> > > > the
> > > > >> > >> > >> > > > > > > > values
> > > > >> > >> > >> > > > > > > > > > for
> > > > >> > >> > >> > > > > > > > > > > > > small requests may be < 1ms. But
> > > > rather
> > > > >> > than
> > > > >> > >> add
> > > > >> > >> > >> up
> > > > >> > >> > >> > the
> > > > >> > >> > >> > > > > time
> > > > >> > >> > >> > > > > > > > spent
> > > > >> > >> > >> > > > > > > > > in
> > > > >> > >> > >> > > > > > > > > > > I/O
> > > > >> > >> > >> > > > > > > > > > > > > thread and network thread,
> > wouldn't
> > > it
> > > > >> be
> > > > >> > >> better
> > > > >> > >> > >> to
> > > > >> > >> > >> > > > convert
> > > > >> > >> > >> > > > > > the
> > > > >> > >> > >> > > > > > > > > time
> > > > >> > >> > >> > > > > > > > > > > > spent
> > > > >> > >> > >> > > > > > > > > > > > > on each thread into a separate
> > > ratio?
> > > > >> UserA
> > > > >> > >> has
> > > > >> > >> > a
> > > > >> > >> > >> > > request
> > > > >> > >> > >> > > > > > quota
> > > > >> > >> > >> > > > > > > > of
> > > > >> > >> > >> > > > > > > > > > 5%.
> > > > >> > >> > >> > > > > > > > > > > > Can
> > > > >> > >> > >> > > > > > > > > > > > > we take that to mean that UserA
> > can
> > > > use
> > > > >> 5%
> > > > >> > of
> > > > >> > >> > the
> > > > >> > >> > >> > time
> > > > >> > >> > >> > > on
> > > > >> > >> > >> > > > > > > network
> > > > >> > >> > >> > > > > > > > > > > threads
> > > > >> > >> > >> > > > > > > > > > > > > and 5% of the time on I/O
> threads?
> > > If
> > > > >> > either
> > > > >> > >> is
> > > > >> > >> > >> > > exceeded,
> > > > >> > >> > >> > > > > the
> > > > >> > >> > >> > > > > > > > > > response
> > > > >> > >> > >> > > > > > > > > > > is
> > > > >> > >> > >> > > > > > > > > > > > > throttled - it would mean
> > > maintaining
> > > > >> two
> > > > >> > >> sets
> > > > >> > >> > of
> > > > >> > >> > >> > > metrics
> > > > >> > >> > >> > > > > for
> > > > >> > >> > >> > > > > > > the
> > > > >> > >> > >> > > > > > > > > two
> > > > >> > >> > >> > > > > > > > > > > > > durations, but would result in
> > more
> > > > >> > >> meaningful
> > > > >> > >> > >> > ratios.
> > > > >> > >> > >> > > We
> > > > >> > >> > >> > > > > > could
> > > > >> > >> > >> > > > > > > > > > define
> > > > >> > >> > >> > > > > > > > > > > > two
> > > > >> > >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of
> > > request
> > > > >> > threads
> > > > >> > >> > and
> > > > >> > >> > >> 10%
> > > > >> > >> > >> > > of
> > > > >> > >> > >> > > > > > > network
> > > > >> > >> > >> > > > > > > > > > > > threads),
> > > > >> > >> > >> > > > > > > > > > > > > but that seems unnecessary and
> > > harder
> > > > to
> > > > >> > >> explain
> > > > >> > >> > >> to
> > > > >> > >> > >> > > > users.
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > Back to why and how quotas are
> > > applied
> > > > >> to
> > > > >> > >> > network
> > > > >> > >> > >> > > thread
> > > > >> > >> > >> > > > > > > > > utilization:
> > > > >> > >> > >> > > > > > > > > > > > > a) In the case of fetch,  the
> time
> > > > >> spent in
> > > > >> > >> the
> > > > >> > >> > >> > network
> > > > >> > >> > >> > > > > > thread
> > > > >> > >> > >> > > > > > > > may
> > > > >> > >> > >> > > > > > > > > be
> > > > >> > >> > >> > > > > > > > > > > > > significant and I can see the
> need
> > > to
> > > > >> > include
> > > > >> > >> > >> this.
> > > > >> > >> > >> > Are
> > > > >> > >> > >> > > > > there
> > > > >> > >> > >> > > > > > > > other
> > > > >> > >> > >> > > > > > > > > > > > > requests where the network
> thread
> > > > >> > >> utilization is
> > > > >> > >> > >> > > > > significant?
> > > > >> > >> > >> > > > > > > In
> > > > >> > >> > >> > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > case
> > > > >> > >> > >> > > > > > > > > > > > > of fetch, request handler thread
> > > > >> > utilization
> > > > >> > >> > would
> > > > >> > >> > >> > > > throttle
> > > > >> > >> > >> > > > > > > > clients
> > > > >> > >> > >> > > > > > > > > > > with
> > > > >> > >> > >> > > > > > > > > > > > > high request rate, low data
> volume
> > > and
> > > > >> > fetch
> > > > >> > >> > byte
> > > > >> > >> > >> > rate
> > > > >> > >> > >> > > > > quota
> > > > >> > >> > >> > > > > > > will
> > > > >> > >> > >> > > > > > > > > > > > throttle
> > > > >> > >> > >> > > > > > > > > > > > > clients with high data volume.
> > > Network
> > > > >> > thread
> > > > >> > >> > >> > > utilization
> > > > >> > >> > >> > > > > is
> > > > >> > >> > >> > > > > > > > > perhaps
> > > > >> > >> > >> > > > > > > > > > > > > proportional to the data
> volume. I
> > > am
> > > > >> > >> wondering
> > > > >> > >> > >> if we
> > > > >> > >> > >> > > > even
> > > > >> > >> > >> > > > > > need
> > > > >> > >> > >> > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > > > throttle
> > > > >> > >> > >> > > > > > > > > > > > > based on network thread
> > utilization
> > > or
> > > > >> > >> whether
> > > > >> > >> > the
> > > > >> > >> > >> > data
> > > > >> > >> > >> > > > > > volume
> > > > >> > >> > >> > > > > > > > > quota
> > > > >> > >> > >> > > > > > > > > > > > covers
> > > > >> > >> > >> > > > > > > > > > > > > this case.
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > b) At the moment, we record and
> > > check
> > > > >> for
> > > > >> > >> quota
> > > > >> > >> > >> > > violation
> > > > >> > >> > >> > > > > at
> > > > >> > >> > >> > > > > > > the
> > > > >> > >> > >> > > > > > > > > same
> > > > >> > >> > >> > > > > > > > > > > > time.
> > > > >> > >> > >> > > > > > > > > > > > > If a quota is violated, the
> > response
> > > > is
> > > > >> > >> delayed.
> > > > >> > >> > >> > Using
> > > > >> > >> > >> > > > > Jay'e
> > > > >> > >> > >> > > > > > > > > example
> > > > >> > >> > >> > > > > > > > > > of
> > > > >> > >> > >> > > > > > > > > > > > > disk reads for fetches happening
> > in
> > > > the
> > > > >> > >> network
> > > > >> > >> > >> > thread,
> > > > >> > >> > >> > > > We
> > > > >> > >> > >> > > > > > > can't
> > > > >> > >> > >> > > > > > > > > > record
> > > > >> > >> > >> > > > > > > > > > > > and
> > > > >> > >> > >> > > > > > > > > > > > > delay a response after the disk
> > > reads.
> > > > >> We
> > > > >> > >> could
> > > > >> > >> > >> > record
> > > > >> > >> > >> > > > the
> > > > >> > >> > >> > > > > > time
> > > > >> > >> > >> > > > > > > > > spent
> > > > >> > >> > >> > > > > > > > > > > on
> > > > >> > >> > >> > > > > > > > > > > > > the network thread when the
> > response
> > > > is
> > > > >> > >> complete
> > > > >> > >> > >> and
> > > > >> > >> > >> > > > > > introduce
> > > > >> > >> > >> > > > > > > a
> > > > >> > >> > >> > > > > > > > > > delay
> > > > >> > >> > >> > > > > > > > > > > > for
> > > > >> > >> > >> > > > > > > > > > > > > handling a subsequent request
> > > > (separate
> > > > >> out
> > > > >> > >> > >> recording
> > > > >> > >> > >> > > and
> > > > >> > >> > >> > > > > > quota
> > > > >> > >> > >> > > > > > > > > > > violation
> > > > >> > >> > >> > > > > > > > > > > > > handling in the case of network
> > > thread
> > > > >> > >> > overload).
> > > > >> > >> > >> > Does
> > > > >> > >> > >> > > > that
> > > > >> > >> > >> > > > > > > make
> > > > >> > >> > >> > > > > > > > > > sense?
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > Regards,
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > Rajini
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM,
> > > > Becket
> > > > >> > Qin <
> > > > >> > >> > >> > > > > > > > becket.qin@gmail.com>
> > > > >> > >> > >> > > > > > > > > > > > wrote:
> > > > >> > >> > >> > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > Hey Jay,
> > > > >> > >> > >> > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing
> the
> > > CPU
> > > > >> time
> > > > >> > >> is a
> > > > >> > >> > >> > little
> > > > >> > >> > >> > > > > > > tricky. I
> > > > >> > >> > >> > > > > > > > > am
> > > > >> > >> > >> > > > > > > > > > > > > thinking
> > > > >> > >> > >> > > > > > > > > > > > > > that maybe we can use the
> > existing
> > > > >> > request
> > > > >> > >> > >> > > statistics.
> > > > >> > >> > >> > > > > They
> > > > >> > >> > >> > > > > > > are
> > > > >> > >> > >> > > > > > > > > > > already
> > > > >> > >> > >> > > > > > > > > > > > > > very detailed so we can
> probably
> > > see
> > > > >> the
> > > > >> > >> > >> > approximate
> > > > >> > >> > >> > > > CPU
> > > > >> > >> > >> > > > > > time
> > > > >> > >> > >> > > > > > > > > from
> > > > >> > >> > >> > > > > > > > > > > it,
> > > > >> > >> > >> > > > > > > > > > > > > e.g.
> > > > >> > >> > >> > > > > > > > > > > > > > something like (total_time -
> > > > >> > >> > >> > > > request/response_queue_time
> > > > >> > >> > >> > > > > -
> > > > >> > >> > >> > > > > > > > > > > > remote_time).
> > > > >> > >> > >> > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > I agree with Guozhang that
> when
> > a
> > > > >> user is
> > > > >> > >> > >> throttled
> > > > >> > >> > >> > > it
> > > > >> > >> > >> > > > is
> > > > >> > >> > >> > > > > > > > likely
> > > > >> > >> > >> > > > > > > > > > that
> > > > >> > >> > >> > > > > > > > > > > > we
> > > > >> > >> > >> > > > > > > > > > > > > > need to see if anything has
> went
> > > > wrong
> > > > >> > >> first,
> > > > >> > >> > >> and
> > > > >> > >> > >> > if
> > > > >> > >> > >> > > > the
> > > > >> > >> > >> > > > > > > users
> > > > >> > >> > >> > > > > > > > > are
> > > > >> > >> > >> > > > > > > > > > > well
> > > > >> > >> > >> > > > > > > > > > > > > > behaving and just need more
> > > > >> resources, we
> > > > >> > >> will
> > > > >> > >> > >> have
> > > > >> > >> > >> > > to
> > > > >> > >> > >> > > > > bump
> > > > >> > >> > >> > > > > > > up
> > > > >> > >> > >> > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > quota
> > > > >> > >> > >> > > > > > > > > > > > > > for them. It is true that
> > > > >> pre-allocating
> > > > >> > >> CPU
> > > > >> > >> > >> time
> > > > >> > >> > >> > > quota
> > > > >> > >> > >> > > > > > > > precisely
> > > > >> > >> > >> > > > > > > > > > for
> > > > >> > >> > >> > > > > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > > > users is difficult. So in
> > practice
> > > > it
> > > > >> > would
> > > > >> > >> > >> > probably
> > > > >> > >> > >> > > be
> > > > >> > >> > >> > > > > > more
> > > > >> > >> > >> > > > > > > > like
> > > > >> > >> > >> > > > > > > > > > > first
> > > > >> > >> > >> > > > > > > > > > > > > set
> > > > >> > >> > >> > > > > > > > > > > > > > a relative high protective CPU
> > > time
> > > > >> quota
> > > > >> > >> for
> > > > >> > >> > >> > > everyone
> > > > >> > >> > >> > > > > and
> > > > >> > >> > >> > > > > > > > > increase
> > > > >> > >> > >> > > > > > > > > > > > that
> > > > >> > >> > >> > > > > > > > > > > > > > for some individual clients on
> > > > demand.
> > > > >> > >> > >> > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > Thanks,
> > > > >> > >> > >> > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > >> > >> > >> > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48
> PM,
> > > > >> Guozhang
> > > > >> > >> > Wang <
> > > > >> > >> > >> > > > > > > > > wangguoz@gmail.com
> > > > >> > >> > >> > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > wrote:
> > > > >> > >> > >> > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > > This is a great proposal,
> glad
> > > to
> > > > >> see
> > > > >> > it
> > > > >> > >> > >> > happening.
> > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > > I am inclined to the CPU
> > > > >> throttling, or
> > > > >> > >> more
> > > > >> > >> > >> > > > > specifically
> > > > >> > >> > >> > > > > > > > > > > processing
> > > > >> > >> > >> > > > > > > > > > > > > time
> > > > >> > >> > >> > > > > > > > > > > > > > > ratio instead of the request
> > > rate
> > > > >> > >> throttling
> > > > >> > >> > >> as
> > > > >> > >> > >> > > well.
> > > > >> > >> > >> > > > > > > Becket
> > > > >> > >> > >> > > > > > > > > has
> > > > >> > >> > >> > > > > > > > > > > very
> > > > >> > >> > >> > > > > > > > > > > > > > well
> > > > >> > >> > >> > > > > > > > > > > > > > > summed my rationales above,
> > and
> > > > one
> > > > >> > >> thing to
> > > > >> > >> > >> add
> > > > >> > >> > >> > > here
> > > > >> > >> > >> > > > > is
> > > > >> > >> > >> > > > > > > that
> > > > >> > >> > >> > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > > former
> > > > >> > >> > >> > > > > > > > > > > > > > > has a good support for both
> > > > >> "protecting
> > > > >> > >> > >> against
> > > > >> > >> > >> > > rogue
> > > > >> > >> > >> > > > > > > > clients"
> > > > >> > >> > >> > > > > > > > > as
> > > > >> > >> > >> > > > > > > > > > > > well
> > > > >> > >> > >> > > > > > > > > > > > > as
> > > > >> > >> > >> > > > > > > > > > > > > > > "utilizing a cluster for
> > > > >> multi-tenancy
> > > > >> > >> > usage":
> > > > >> > >> > >> > when
> > > > >> > >> > >> > > > > > > thinking
> > > > >> > >> > >> > > > > > > > > > about
> > > > >> > >> > >> > > > > > > > > > > > how
> > > > >> > >> > >> > > > > > > > > > > > > to
> > > > >> > >> > >> > > > > > > > > > > > > > > explain this to the end
> > users, I
> > > > >> find
> > > > >> > it
> > > > >> > >> > >> actually
> > > > >> > >> > >> > > > more
> > > > >> > >> > >> > > > > > > > natural
> > > > >> > >> > >> > > > > > > > > > than
> > > > >> > >> > >> > > > > > > > > > > > the
> > > > >> > >> > >> > > > > > > > > > > > > > > request rate since as
> > mentioned
> > > > >> above,
> > > > >> > >> > >> different
> > > > >> > >> > >> > > > > requests
> > > > >> > >> > >> > > > > > > > will
> > > > >> > >> > >> > > > > > > > > > have
> > > > >> > >> > >> > > > > > > > > > > > > quite
> > > > >> > >> > >> > > > > > > > > > > > > > > different "cost", and Kafka
> > > today
> > > > >> > already
> > > > >> > >> > have
> > > > >> > >> > >> > > > various
> > > > >> > >> > >> > > > > > > > request
> > > > >> > >> > >> > > > > > > > > > > types
> > > > >> > >> > >> > > > > > > > > > > > > > > (produce, fetch, admin,
> > > metadata,
> > > > >> etc),
> > > > >> > >> > >> because
> > > > >> > >> > >> > of
> > > > >> > >> > >> > > > that
> > > > >> > >> > >> > > > > > the
> > > > >> > >> > >> > > > > > > > > > request
> > > > >> > >> > >> > > > > > > > > > > > > rate
> > > > >> > >> > >> > > > > > > > > > > > > > > throttling may not be as
> > > effective
> > > > >> > >> unless it
> > > > >> > >> > >> is
> > > > >> > >> > >> > set
> > > > >> > >> > >> > > > > very
> > > > >> > >> > >> > > > > > > > > > > > > conservatively.
> > > > >> > >> > >> > > > > > > > > > > > > > >
> > > > >> > >> > >> > > > > > > > > > > > > > > Regarding to user reactions
> > when
> > > > >> they
> > > > >> > are
> > > > >> > >> > >> > > throttled,
> > > > >> > >> > >> > > > I
> > > > >> > >> > >> > > > > > > think
> > > > >> > >> > >> > > > > > > > it
> > > > >> > >> > >> > > > > > > > > > may
> > > > >> > >> > >> > > > > > > > > > > > > > differ
> > > > >> > >> > >> > > > > > > > > > > > > > > case-by-case, and need to be
> > > > >> > discovered /
> > > > >> > >> > >> guided
> > > > >> > >> > >> > by
> > > > >> > >> > >> > > > > > looking
> > > > >> > >> > >> > > > > > > > at
> > > > >> > >> > >> > > > > > > > > > > > relative
> > > > >> > >> > >> > > > > > > > > > > > > > > metrics. So in other words
> > users
> > > > >> would
> > > > >> > >> not
> > > > >> > >> > >> expect
> > > > >> > >> > >> > > to
> > > > >> > >> > >> > > > > get
> > > > >> > >> > >> > > > > > > > > > additional
> > > > >> > >> > >> > > > > > > > > > > > > > > information by simply being
> > told
> > > > >> "hey,
> > > > >> > >> you
> > > > >> > >> > are
> > > > >> > >> > >> > > > > > throttled",
> > > > >> > >> > >> > > > > > > > > which
> > > > >> > >> > >> > > > > > > > > > is
> > > > >> > >> > >> > > > > > > > > > > > all
> > > > >> > >> > >> > > > > > > > > > > > > > > what throttling does; they
> > need
> > > to
> > > > >> > take a
> > > > >> > >> > >> > follow-up
> > > > >> > >> > >> > > > > step
> > > > >> > >> > >> > > > > > > and
> > > > >> > >> > >> > > > > > > > > see
> > > > >> > >> > >> > > > > > > > > > > > "hmm,
> > > > >> > >> > >> > > > > > > > > > > > > > I'm
> > > > >> > >> > >> > > > > > > > > > > > > > > throttled probably because
> of
> > > ..",
> > > > >> > which
> > > > >> > >> is
> > > > >> > >> > by
> > > > >> > >> > >> > > > looking
> > > > >> > >> > >> > > > > at
> > > > >> > >> > >> > > > > > > > other
> > > > >> > >> > >> > > > > > > > > > > > metric
> > > > >> > >> > >> > > > > > > > > > > > > > > values: e.g. whether I'm
> > > > bombarding
> > > > >> the
> > > > >> > >> > >> brokers
> > > > >> > >> > >> > > with
> > > > >> > >> > >> > > > >
> > > > >>
> > > > > ...
> > > > >
> > > > > [Message clipped]
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Jun,

Agree about the two scenarios.

But still not sure about a single quota covering both network threads and
I/O threads with per-thread quota. If there are 10 I/O threads and 5
network threads and I want to assign half the quota to userA, the quota
would be 750%. I imagine, internally, we would convert this to 500% for I/O
and 250% for network threads to allocate 50% of each pool.
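
To make that arithmetic concrete, here is a minimal Scala sketch of the
conversion (illustrative only; none of these names come from the KIP):

    // Pool sizes from the example above
    val numIoThreads = 10
    val numNetworkThreads = 5
    val share = 0.5 // userA gets half of each pool

    // Per-thread model: one quota expressed against (n + m) * 100%
    val quotaPercent = share * (numIoThreads + numNetworkThreads) * 100 // 750.0

    // Internally split back into per-pool budgets
    val ioBudgetPercent = share * numIoThreads * 100           // 500.0
    val networkBudgetPercent = share * numNetworkThreads * 100 // 250.0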

A couple of scenarios:

1. Admin adds 1 extra network thread. To retain 50%, admin needs to now
allocate 800% for each user. Or increase the quota for a few users. To me,
it feels like admin needs to convert 50% to 800% and Kafka internally needs
to convert 800% to (500%, 300%). Everyone using just 50% feels a lot
simpler.

2. We decide to add some other thread to this list. Admin needs to know
exactly how many threads form the maximum quota. And we can be changing
this between broker versions as we add more to the list. Again a single
overall percent would be a lot simpler.

There were others who were unconvinced by a single percent from the initial
proposal and were happier with thread units similar to CPU units, so I am
ok with going with per-thread quotas (as units or percent). Just not sure
it makes it easier for admin in all cases.

Regards,

Rajini


On Fri, Mar 3, 2017 at 6:03 AM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> Consider modeling as n * 100% units. For 2), the question is what's causing
> the I/O threads to be saturated. It's unlikely that all users' utilization
> has increased at the same time. A more likely case is that a few isolated
> users' utilization has increased. If so, after increasing the number of
> threads, the admin just needs to adjust the quota for a few isolated users,
> which is expected and is less work.
>
> Consider modeling as 1 * 100% unit. For 1), all users' quotas need to be
> adjusted, which is unexpected and is more work.
>
> So, to me, the n * 100% model seems more convenient.
>
> As for future extension to cover network thread utilization, I was thinking
> that one way is to simply model the capacity as (n + m) * 100% unit, where
> n and m are the number of network and i/o threads, respectively. Then, for
> each user, we can just add up the utilization in the network and the i/o
> thread. If we do this, we don't need a new type of quota.
>
> Thanks,
>
> Jun
>
>
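A small sketch of the (n + m) * 100% capacity model Jun describes above
(illustrative Scala; the function and parameter names are invented, not
from the KIP or the Kafka codebase):

    // Total capacity is (n + m) * 100%: n network threads plus m I/O
    // threads. A user's usage is the sum of its utilization on both pools.
    def overQuota(networkTimePercent: Double, ioTimePercent: Double,
                  quotaPercent: Double): Boolean =
      networkTimePercent + ioTimePercent > quotaPercent

    // e.g. 40% on network threads plus 120% on I/O threads against a
    // 200% quota is still within budget:
    overQuota(40.0, 120.0, 200.0) // false
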
> On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Jun,
> >
> > If we use request.percentage as the percentage used in a single I/O
> > thread, the total percentage being allocated will be num.io.threads * 100
> > for I/O threads and num.network.threads * 100 for network threads. A
> > single quota covering the two as a percentage wouldn't quite work if you
> > want to allocate the same proportion in both cases. If we want to treat
> > threads as separate units, won't we need two quota configurations
> > regardless of whether we use units or percentage? Perhaps I misunderstood
> > your suggestion.
> >
> > I think there are two cases:
> >
> >    1. The use case that you mentioned where an admin is adding more users
> >    and decides to add more I/O threads and expects to find free quota to
> >    allocate for new users.
> >    2. Admin adds more I/O threads because the I/O threads are saturated
> >    and there are cores available to allocate, even though the number of
> >    users/clients hasn't changed.
> >
> > If we treated I/O threads as a single unit of 100%, all user quotas need
> > to be reallocated for 1). If we allocated I/O threads as n units with
> > n*100%, all user quotas need to be reallocated for 2), otherwise some of
> > the new threads may just not be used. Either way it should be easy to
> > write a script to decrease/increase quotas by a multiple for all users.
> >
> > So it really boils down to which quota unit is most intuitive in terms of
> > configuration. And from the discussion so far, it feels like opinion is
> > divided on whether quotas should be carved out of an absolute 100% (or 1
> > unit) or be relative to the number of threads (n*100% or n units).
> >
> >
> >
> > On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Another way to express an absolute limit is to use request.percentage,
> > > but treat it as the percentage used in a single request handling
> > > thread. For now, the request handling threads can be just the io
> > > threads. In the future, they can cover the network threads as well.
> > > This is similar to how top reports CPU usage and may be a bit easier
> > > for people to understand.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Jay,
> > > >
> > > > 2. Regarding request.unit vs request.percentage. I started with
> > > > request.percentage too. The reasoning for request.unit is the
> > > > following. Suppose that the capacity has been reached on a broker and
> > > > the admin needs to add a new user. A simple way to increase the
> > > > capacity is to increase the number of io threads, assuming there are
> > > > still enough cores. If the limit is based on percentage, the
> > > > additional capacity automatically gets distributed to existing users
> > > > and we haven't really carved out any additional resource for the new
> > > > user. Now, is it easy for a user to reason about 0.1 unit vs 10%? My
> > > > feeling is that both are hard and have to be configured empirically.
> > > > Not sure if percentage is obviously easier to reason about.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:
> > > >
> > > >> A couple of quick points:
> > > >>
> > > >> 1. Even though the implementation of this quota is only using io
> > > >> thread time, I think we should call it something like
> > > >> "request-time". This will give us flexibility to improve the
> > > >> implementation to cover network threads in the future and will avoid
> > > >> exposing internal details like our thread pools on the server.
> > > >>
> > > >> 2. Jun/Roger, I get what you are trying to fix but the idea of
> > > >> thread/units is super unintuitive as a user-facing knob. I had to
> > > >> read the KIP like eight times to understand this. I'm not sure about
> > > >> your point that increasing the number of threads is a problem with a
> > > >> percentage-based value; it really depends on whether the user thinks
> > > >> about the "percentage of request processing time" or "thread units".
> > > >> If they think "I have allocated 10% of my request processing time to
> > > >> user x" then it is a bug that increasing the thread count decreases
> > > >> that percent as it does in the current proposal. As a practical
> > > >> matter I think the only way to actually reason about this is as a
> > > >> percent; I just don't believe people are going to think, "ah, 4.3
> > > >> thread units, that is the right amount!". Instead I think they have
> > > >> to understand this thread unit concept, figure out what they have
> > > >> set in number of threads, compute a percent and then come up with
> > > >> the number of thread units, and these will all be wrong if that
> > > >> thread count changes. I also think this ties us to throttling the
> > > >> I/O thread pool, which may not be where we want to end up.
> > > >>
> > > >> 3. For what it's worth I do think having a single throttle_ms field
> > > >> in all the responses that combines all throttling from all quotas is
> > > >> probably the simplest. There could be a use case for having separate
> > > >> fields for each, but I think that is actually harder to use/monitor
> > > >> in the common case, so unless someone has a use case I think just
> > > >> one should be fine.
> > > >>
> > > >> -Jay
> > > >>
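To illustrate Jay's point 3, one plausible way a single combined field
could be computed (a sketch under assumed names, not the actual
implementation):

    // If both the byte-rate quota and the request-time quota would delay a
    // response, report one combined throttle time. The longest delay wins,
    // since the response cannot be sent before both delays have elapsed.
    def combinedThrottleTimeMs(byteRateDelayMs: Long,
                               requestTimeDelayMs: Long): Long =
      math.max(byteRateDelayMs, requestTimeDelayMs)
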
> > > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram
> > > >> <rajinisivaram@gmail.com> wrote:
> > > >>
> > > >> > I have updated the KIP based on the discussions so far.
> > > >> >
> > > >> >
> > > >> > Regards,
> > > >> >
> > > >> > Rajini
> > > >> >
> > > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram
> > > >> > <rajinisivaram@gmail.com> wrote:
> > > >> >
> > > >> > > Thank you all for the feedback.
> > > >> > >
> > > >> > > Ismael #1. It makes sense not to throttle inter-broker requests
> > > >> > > like LeaderAndIsr etc. The simplest way to ensure that clients
> > > >> > > cannot use these requests to bypass quotas for DoS attacks is to
> > > >> > > ensure that ACLs prevent clients from using these requests and
> > > >> > > unauthorized requests are included towards quotas.
> > > >> > >
> > > >> > > Ismael #2, Jay #1: I was thinking that these quotas can return a
> > > >> > > separate throttle time, and all utilization based quotas could
> > > >> > > use the same field (we won't add another one for network thread
> > > >> > > utilization for instance). But perhaps it makes sense to keep
> > > >> > > byte rate quotas separate in produce/fetch responses to provide
> > > >> > > separate metrics? Agree with Ismael that the name of the
> > > >> > > existing field should be changed if we have two. Happy to switch
> > > >> > > to a single combined throttle time if that is sufficient.
> > > >> > >
> > > >> > > Ismael #4, #5, #6: Will update KIP. Will use a dot separated
> > > >> > > name for the new property. Replication quotas use dot separated
> > > >> > > names, so it will be consistent with all properties except byte
> > > >> > > rate quotas.
> > > >> > >
> > > >> > > Radai: #1 Request processing time rather than request rate was
> > > >> > > chosen because the time per request can vary significantly
> > > >> > > between requests, as mentioned in the discussion and KIP.
> > > >> > > #2 Two separate quotas for heartbeats/regular requests feel like
> > > >> > > more configuration and more metrics. Since most users would set
> > > >> > > quotas higher than the expected usage and quotas are more of a
> > > >> > > safety net, a single quota should work in most cases.
> > > >> > > #3 The number of requests in purgatory is limited by the number
> > > >> > > of active connections since only one request per connection will
> > > >> > > be throttled at a time.
> > > >> > > #4 As with byte rate quotas, to use the full allocated quotas,
> > > >> > > clients/users would need to use partitions that are distributed
> > > >> > > across the cluster. The alternative of using cluster-wide quotas
> > > >> > > instead of per-broker quotas would be far too complex to
> > > >> > > implement.
> > > >> > >
> > > >> > > Dong: We currently have two ClientQuotaManagers for quota types
> > > >> > > Fetch and Produce. A new one will be added for IOThread, which
> > > >> > > manages quotas for I/O thread utilization. This will not update
> > > >> > > the Fetch or Produce queue-size, but will have a separate metric
> > > >> > > for the queue-size. I wasn't planning to add any additional
> > > >> > > metrics apart from the equivalent ones for existing quotas as
> > > >> > > part of this KIP. Ratio of byte-rate to I/O thread utilization
> > > >> > > could be slightly misleading since it depends on the sequence of
> > > >> > > requests. But we can look into more metrics after the KIP is
> > > >> > > implemented if required.
> > > >> > >
> > > >> > > I think we need to limit the maximum delay since all requests
> > > >> > > are throttled. If a client has a quota of 0.001 units and a
> > > >> > > single request used 50ms, we don't want to delay all requests
> > > >> > > from the client by 50 seconds, throwing the client out of all
> > > >> > > its consumer groups. The issue is only if a user is allocated a
> > > >> > > quota that is insufficient to process one large request. The
> > > >> > > expectation is that the units allocated per user will be much
> > > >> > > higher than the time taken to process one request and the limit
> > > >> > > should seldom be applied. Agree this needs proper documentation.
> > > >> > >
> > > >> > > Regards,
> > > >> > >
> > > >> > > Rajini
> > > >> > >
> > > >> > >
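A small sketch of the capping rule Rajini describes above (illustrative
only; the formula and names are assumptions, not the KIP's implementation,
and the 30-second window is just an example value):

    // Delay a throttled client long enough to bring its average usage back
    // under quota, but never longer than one quota window.
    def throttleDelayMs(timeUsedMs: Double, quotaFraction: Double,
                        windowMs: Long): Long = {
      val rawDelayMs = timeUsedMs / quotaFraction - timeUsedMs
      math.min(rawDelayMs.toLong, windowMs)
    }

    // A quota of 0.001 units and a single 50 ms request: the raw delay
    // would be almost 50 seconds, so it is capped at the window instead.
    throttleDelayMs(50.0, 0.001, 30000) // 30000
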
> > > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai
> > > >> > > <radai.rosenblatt@gmail.com> wrote:
> > > >> > >
> > > >> > >> @jun: i wasn't concerned about tying up a request processing
> > > >> > >> thread, but IIUC the code does still read the entire request
> > > >> > >> out, which might add up to a non-negligible amount of memory.
> > > >> > >>
> > > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin
> > > >> > >> <lindong28@gmail.com> wrote:
> > > >> > >>
> > > >> > >> > Hey Rajini,
> > > >> > >> >
> > > >> > >> > The current KIP says that the maximum delay will be reduced
> > > >> > >> > to the window size if it is larger than the window size. I
> > > >> > >> > have a concern with this:
> > > >> > >> >
> > > >> > >> > 1) This essentially means that the user is allowed to exceed
> > > >> > >> > their quota over a long period of time. Can you provide an
> > > >> > >> > upper bound on this deviation?
> > > >> > >> >
> > > >> > >> > 2) What is the motivation for capping the maximum delay by
> > > >> > >> > the window size? I am wondering if there is a better
> > > >> > >> > alternative to address the problem.
> > > >> > >> >
> > > >> > >> > 3) It means that the existing metric-related config will
> > > >> > >> > have a more direct impact on the mechanism of this
> > > >> > >> > io-thread-unit-based quota. This may be an important change
> > > >> > >> > depending on the answer to 1) above. We probably need to
> > > >> > >> > document this more explicitly.
> > > >> > >> >
> > > >> > >> > Dong
> > > >> > >> >
> > > >> > >> >
> > > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin
> > > >> > >> > <lindong28@gmail.com> wrote:
> > > >> > >> >
> > > >> > >> > > Hey Jun,
> > > >> > >> > >
> > > >> > >> > > Yeah you are right. I thought it wasn't because at LinkedIn
> > it
> > > >> will
> > > >> > be
> > > >> > >> > too
> > > >> > >> > > much pressure on inGraph to expose those per-clientId
> metrics
> > > so
> > > >> we
> > > >> > >> ended
> > > >> > >> > > up printing them periodically to local log. Never mind if
> it
> > is
> > > >> not
> > > >> > a
> > > >> > >> > > general problem.
> > > >> > >> > >
> > > >> > >> > > Hey Rajini,
> > > >> > >> > >
> > > >> > >> > > - I agree with Jay that we probably don't want to add a new
> > > field
> > > >> > for
> > > >> > >> > > every quota ProduceResponse or FetchResponse. Is there any
> > > >> use-case
> > > >> > >> for
> > > >> > >> > > having separate throttle-time fields for byte-rate-quota
> and
> > > >> > >> > > io-thread-unit-quota? You probably need to document this as
> > > >> > interface
> > > >> > >> > > change if you plan to add new field in any request.
> > > >> > >> > >
> > > >> > >> > > - I don't think IOThread belongs to quotaType. The existing
> > > quota
> > > >> > >> types
> > > >> > >> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> > > >> identify
> > > >> > >> the
> > > >> > >> > > type of request that are throttled, not the quota mechanism
> > > that
> > > >> is
> > > >> > >> > applied.
> > > >> > >> > >
> > > >> > >> > > - If a request is throttled due to this
> io-thread-unit-based
> > > >> quota,
> > > >> > is
> > > >> > >> > the
> > > >> > >> > > existing queue-size metric in ClientQuotaManager
> incremented?
> > > >> > >> > >
> > > >> > >> > > - In the interest of providing guide line for admin to
> decide
> > > >> > >> > > io-thread-unit-based quota and for user to understand its
> > > impact
> > > >> on
> > > >> > >> their
> > > >> > >> > > traffic, would it be useful to have a metric that shows the
> > > >> overall
> > > >> > >> > > byte-rate per io-thread-unit? Can we also show this a
> > > >> per-clientId
> > > >> > >> > metric?
> > > >> > >> > >
> > > >> > >> > > Thanks,
> > > >> > >> > > Dong
>
>
> On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <jun@confluent.io> wrote:
>
> Hi, Ismael,
>
> For #3, typically an admin won't configure more I/O threads than CPU
> cores, but it's possible for an admin to start with fewer I/O threads
> than cores and grow that later on.
>
> Hi, Dong,
>
> I think the throttleTime sensor on the broker tells the admin whether a
> user/clientId is throttled or not.
>
> Hi, Radai,
>
> The reasoning for delaying the throttled requests on the broker, instead
> of returning an error immediately, is that the latter has no way to
> prevent the client from retrying immediately, which would make things
> worse. The delaying logic is based off a delay queue. A separate
> expiration thread just waits on the next request to be expired, so it
> doesn't tie up a request handler thread.
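>
> In outline, the delay queue mechanism looks roughly like this (a minimal
> Java sketch using java.util.concurrent.DelayQueue; the class and method
> names are illustrative, not the actual broker code):
>
>     import java.util.concurrent.DelayQueue;
>     import java.util.concurrent.Delayed;
>     import java.util.concurrent.TimeUnit;
>
>     class ThrottledResponse implements Delayed {
>         final long endTimeMs;
>         final Runnable sendResponse;
>
>         ThrottledResponse(long delayMs, Runnable sendResponse) {
>             this.endTimeMs = System.currentTimeMillis() + delayMs;
>             this.sendResponse = sendResponse;
>         }
>
>         @Override
>         public long getDelay(TimeUnit unit) {
>             return unit.convert(endTimeMs - System.currentTimeMillis(),
>                                 TimeUnit.MILLISECONDS);
>         }
>
>         @Override
>         public int compareTo(Delayed other) {
>             return Long.compare(getDelay(TimeUnit.MILLISECONDS),
>                                 other.getDelay(TimeUnit.MILLISECONDS));
>         }
>     }
>
>     // A single expiration thread blocks on the head of the queue and
>     // sends each response once its delay elapses, so request handler
>     // threads are never tied up by throttling.
>     void expirationLoop(DelayQueue<ThrottledResponse> queue)
>             throws InterruptedException {
>         while (true)
>             queue.take().sendResponse.run();
>     }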
>
> Thanks,
>
> Jun
>
>
> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk> wrote:
>
> Hi Jay,
>
> Regarding 1, I definitely like the simplicity of keeping a single
> throttle time field in the response. The downside is that the client
> metrics will be more coarse-grained.
>
> Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> `log.cleaner.min.cleanable.ratio`.
>
> Ismael
>
>
> On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io> wrote:
>
> A few minor comments:
>
>    1. Isn't it the case that the throttling time response field should
>    have the total time your request was throttled, irrespective of the
>    quotas that caused it? Limiting it to the byte rate quota doesn't
>    make sense, but I also don't think we want to end up adding new
>    fields in the response for every single thing we quota, right?
>    2. I don't think we should make this quota specifically about I/O
>    threads. Once we introduce these quotas, people set them and expect
>    them to be enforced (and if they aren't, it may cause an outage). As
>    a result they are a bit more sensitive than normal configs, I think.
>    The current thread pools seem like something of an implementation
>    detail and not the level the user-facing quotas should be involved
>    with. I think it might be better to make this a general request-time
>    throttle with no mention of I/O threads in the naming, and simply
>    acknowledge the current limitation (which we may someday fix) in the
>    docs: that this covers only the time after the request is read off
>    the network.
>    3. As such, I think the right interface to the user would be
>    something like percent_request_time in {0,...,100}, or
>    request_time_ratio in {0.0,...,1.0} (I think "ratio" is the
>    terminology we used if the scale is between 0 and 1 in the other
>    metrics, right?)
>
> -Jay
>
>
> On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> wrote:
>
> Guozhang/Dong,
>
> Thank you for the feedback.
>
> Guozhang: I have updated the section on co-existence of byte rate and
> request time quotas.
>
> Dong: I hadn't added much detail on the metrics and sensors since they
> are going to be very similar to the existing metrics and sensors. To
> avoid confusion, I have now added more detail. All metrics are in the
> group "quotaType" and all sensors have names starting with "quotaType"
> (where quotaType is one of Produce/Fetch/LeaderReplication/
> FollowerReplication/IOThread). So there will be no reuse of existing
> metrics/sensors. The new ones for request processing time based
> throttling will be completely independent of the existing
> metrics/sensors, but will be consistent in format.
>
> The existing throttle_time_ms field in produce/fetch responses will not
> be impacted by this KIP. That will continue to return byte-rate based
> throttling times. In addition, a new field request_throttle_time_ms will
> be added to return request quota based throttling times. These will be
> exposed as new metrics on the client side.
>
> Since all metrics and sensors are different for each type of quota, I
> believe there are already sufficient metrics to monitor throttling on
> both the client and broker side for each type of throttling.
>
> Regards,
>
> Rajini
>
>
> On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:
>
> Hey Rajini,
>
> I think it makes a lot of sense to use io_thread_units as the metric to
> quota users' traffic here. LGTM overall. I have some questions regarding
> sensors.
>
> - Can you be more specific in the KIP about which sensors will be added?
> For example, it would be useful to specify the name and attributes of
> these new sensors.
>
> - We currently have throttle-time and queue-size for the byte-rate based
> quota. Are you going to have a separate throttle-time and queue-size for
> requests throttled by the io-thread-unit-based quota, or will they share
> the same sensors?
>
> - Does the throttle-time in the ProduceResponse and FetchResponse
> contain the time due to the io-thread-unit-based quota?
>
> - Currently the Kafka server doesn't provide any log or metrics that
> tell whether any given clientId (or user) is throttled. This is not too
> bad because we can still check the client-side byte-rate metric to
> validate whether a given client is throttled. But with this
> io-thread-unit quota, there will be no way to validate whether a given
> client is slow because it has exceeded its io-thread-unit limit. It is
> necessary for users to be able to know this information to figure out
> whether they have reached their quota limit. How about we add a log4j
> log on the server side to periodically print the (client_id,
> byte-rate-throttle-time, io-thread-unit-throttle-time), so that the
> Kafka administrator can identify the users that have reached their limit
> and act accordingly?
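>
> Something like the following would be enough (an illustrative sketch
> only; ThrottleStats and the log format are made up):
>
>     // Periodically log per-client throttle times so admins can spot
>     // clients that are hitting their quotas.
>     private static final org.slf4j.Logger log =
>         org.slf4j.LoggerFactory.getLogger("kafka.server.QuotaLogger");
>
>     void scheduleQuotaLogging(java.util.concurrent.ScheduledExecutorService scheduler,
>                               java.util.Map<String, ThrottleStats> statsByClientId) {
>         scheduler.scheduleAtFixedRate(() ->
>             statsByClientId.forEach((clientId, stats) ->
>                 log.info("clientId={} byteRateThrottleMs={} ioThreadThrottleMs={}",
>                          clientId, stats.byteRateThrottleMs(), stats.ioThreadThrottleMs())),
>             1, 1, java.util.concurrent.TimeUnit.MINUTES);
>     }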
>
> Thanks,
> Dong
>
>
> On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com>
> wrote:
>
> Made a pass over the doc; overall LGTM except for a minor comment on the
> throttling implementation:
>
> It is stated that "Request processing time throttling will be applied on
> top if necessary." I thought that meant the request processing time
> throttling is applied first, but continuing to read I found it actually
> meant that produce/fetch byte rate throttling is applied first.
>
> Also, the last sentence, "The remaining delay if any is applied to the
> response.", is a bit confusing to me. Maybe reword it a bit?
>
>
> Guozhang
>
>
> On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:
>
> Hi, Rajini,
>
> Thanks for the updated KIP. The latest proposal looks good to me.
>
> Jun
>
> On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> wrote:
>
> Jun/Roger,
>
> Thank you for the feedback.
>
> 1. I have updated the KIP to use absolute units instead of a percentage.
> The property is called io_thread_units to align with the thread count
> property num.io.threads. When we implement network thread utilization
> quotas, we can add another property, network_thread_units.
>
> 2. ControlledShutdown is already listed under the exempt requests. Jun,
> did you mean a different request that needs to be added? The four
> requests currently exempt in the KIP are StopReplica,
> ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are
> controlled using the ClusterAction ACL, so it is easy to exclude them
> and throttle only if unauthorized. I wasn't sure if there are other
> requests used only for inter-broker communication that need to be
> excluded.
>
> 3. I was thinking the smallest change would be to replace all references
> to requestChannel.sendResponse() with a local method
> sendResponseMaybeThrottle() that does the throttling, if any, plus sends
> the response. If we throttle first in KafkaApis.handle(), the time spent
> within the method handling the request will not be recorded or used in
> throttling. We can look into this again when the PR is ready for review.
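>
> In outline (an illustrative sketch with made-up signatures; the actual
> change in KafkaApis will differ):
>
>     // Record the time this request consumed, then either send the
>     // response immediately or hand it to the quota delay queue.
>     private void sendResponseMaybeThrottle(Request request, Response response) {
>         long throttleTimeMs = quotaManager.recordAndGetThrottleTimeMs(
>             request.session(), request.header().clientId(),
>             request.requestThreadTimeNanos());
>         if (throttleTimeMs > 0)
>             // Delaying via the quota manager frees the request handler
>             // thread immediately.
>             quotaManager.delayResponse(request, response, throttleTimeMs);
>         else
>             requestChannel.sendResponse(response);
>     }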
>
> Regards,
>
> Rajini
>
>
> On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com>
> wrote:
>
> Great to see this KIP and the excellent discussion.
>
> To me, Jun's suggestion makes sense. If my application is allocated 1
> request handler unit, then it's as if I have a Kafka broker with a
> single request handler thread dedicated to me. That's the most I can
> use, at least. That allocation doesn't change even if an admin later
> increases the size of the request thread pool on the broker. It's
> similar to the CPU abstraction that VMs and containers get from
> hypervisors or OS schedulers. While different client access patterns can
> use wildly different amounts of request thread resources per request, a
> given application will generally have a stable access pattern and can
> figure out empirically how many "request thread units" it needs to meet
> its throughput/latency goals.
>
> Cheers,
>
> Roger
>
> On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:
>
> Hi, Rajini,
>
> Thanks for the updated KIP. A few more comments.
>
> 1. A concern with request_time_percent is that it's not an absolute
> value. Let's say you give a user a 10% limit. If the admin doubles the
> number of request handler threads, that user now actually has twice the
> absolute capacity. This may confuse people a bit. So, perhaps setting
> the quota based on an absolute request thread unit is better.
>
> 2. ControlledShutdownRequest is also an inter-broker request and needs
> to be excluded from throttling.
>
> 3. Implementation-wise, I am wondering if it's simpler to apply the
> request time throttling first, in KafkaApis.handle(). Otherwise, we will
> need to add the throttling logic to each type of request.
>
> Thanks,
>
> Jun
>
> On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> wrote:
>
> Jun,
>
> Thank you for the review.
>
> I have reverted to the original KIP, which throttles based on request
> handler utilization. At the moment it uses a percentage, but I am happy
> to change to a fraction (out of 1 instead of 100) if required. I have
> added the examples from this discussion to the KIP. Also added a "Future
> Work" section to address network thread utilization. The configuration
> is named "request_time_percent" with the expectation that it can also be
> used as the limit for network thread utilization when that is
> implemented, so that users have to set only one config for the two and
> not worry about the internal distribution of the work between the two
> thread pools in Kafka.
>
>
> Regards,
>
> Rajini
>
>
> On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
>
> Hi, Rajini,
>
> Thanks for the proposal.
>
> The benefit of using the request processing time over the request rate
> is exactly what people have said. I will just expand on that a bit.
> Consider the following case. The producer sends a produce request with a
> 10MB message but compressed to 100KB with gzip. The decompression of the
> message on the broker could take 10-15 seconds, during which time a
> request handler thread is completely blocked. In this case, neither the
> byte-in quota nor the request rate quota may be effective in protecting
> the broker. Consider another case. A consumer group starts with 10
> instances and later on switches to 20 instances. The request rate will
> likely double, but the actual load on the broker may not double since
> each fetch request only contains half of the partitions. A request rate
> quota may not be easy to configure in this case.
>
> What we really want is to be able to prevent a client from using too
> much of the server-side resources. In this particular KIP, the resource
> is the capacity of the request handler threads. I agree that it may not
> be intuitive for the users to determine how to set the right limit.
> However, this is not completely new and has been done in the container
> world already. For example, Linux cgroup (https://access.redhat.com/
> documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
> Resource_Management_Guide/sec-cpu.html) has the concept of
> cpu.cfs_quota_us, which specifies the total amount of time in
> microseconds for which all tasks in a cgroup can run during a one-second
> period. We can potentially model the request handler threads in a
> similar way. For example, each request handler thread can be 1 request
> handler unit, and the admin can configure a limit on how many units (say
> 0.01) a client can have.
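>
> To make the unit model concrete, the arithmetic would be roughly (an
> illustrative sketch with made-up numbers, not from the KIP):
>
>     // Each request handler thread contributes 1 unit of capacity.
>     int numIoThreads = 8;            // num.io.threads => 8 units in total
>     double clientQuotaUnits = 0.01;  // admin-assigned quota for one client
>     long windowMs = 1000;            // quota measurement window
>     // Request handler time this client may consume per window:
>     double allowedThreadTimeMs = clientQuotaUnits * windowMs; // 10 ms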
>
> Regarding not throttling the internal broker-to-broker requests: we
> could do that. Alternatively, we could just let the admin configure a
> high limit for the kafka user (it may not be easy to do that based on
> clientId though).
>
> Ideally we want to be able to protect the utilization of the network
> thread pool too. The difficulty is mostly what Rajini said: (1) The
> mechanism for throttling the requests is through Purgatory and we will
> have to think through how to integrate that into the network layer. (2)
> In the network layer, we currently know the user, but not the clientId
> of the request. So, it's a bit tricky to throttle based on clientId
> there. Plus, the byteOut quota can already protect the network thread
> utilization for fetch requests. So, if we can't figure out this part
> right now, just focusing on the request handler threads for this KIP is
> still a useful feature.
>
> Thanks,
>
> Jun
>
>
> On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> wrote:
>
> Thank you all for the feedback.
>
> Jay: I have removed the exemption for consumer heartbeat etc. Agree that
> protecting the cluster is more important than protecting individual
> apps. I have retained the exemption for StopReplica/LeaderAndIsr etc.;
> these are throttled only if authorization fails (so they can't be used
> for DoS attacks in a secure cluster, but inter-broker requests complete
> without delays).
>
> I will wait another day to see if there is any objection to quotas based
> on request processing time (as opposed to request rate), and if there
> are no objections, I will revert to the original proposal with some
> changes.
>
> The original proposal was only including the time used by the request
> handler threads (that made the calculation easy). I think the suggestion
> is to include the time spent in the network threads as well, since that
> may be significant. As Jay pointed out, it is more complicated to
> calculate the total available CPU time and convert to a ratio when there
> are *m* I/O threads and *n* network threads.
> ThreadMXBean#getThreadCPUTime() may give us what we want, but it can be
> very expensive on some platforms. As Becket and Guozhang have pointed
> out, we do have several time measurements already for generating metrics
> that we could use, though we might want to switch to nanoTime() instead
> of currentTimeMillis() since some of the values for small requests may
> be < 1ms. But rather than add up the time spent in the I/O thread and
> the network thread, wouldn't it be better to convert the time spent on
> each thread into a separate ratio? UserA has a request quota of 5%. Can
> we take that to mean that UserA can use 5% of the time on network
> threads and 5% of the time on I/O threads? If either is exceeded, the
> response is throttled - it would mean maintaining two sets of metrics
> for the two durations, but would result in more meaningful ratios. We
> could define two quota limits (UserA has 5% of request threads and 10%
> of network threads), but that seems unnecessary and harder to explain to
> users.
>
> Back to why and how quotas are applied to network thread utilization:
>
> a) In the case of fetch, the time spent in the network thread may be
> significant and I can see the need to include this. Are there other
> requests where the network thread utilization is significant? In the
> case of fetch, request handler thread utilization would throttle clients
> with a high request rate and low data volume, and the fetch byte rate
> quota will throttle clients with a high data volume. Network thread
> utilization is perhaps proportional to the data volume. I am wondering
> if we even need to throttle based on network thread utilization, or
> whether the data volume quota covers this case.
>
> b) At the moment, we record and check for quota violation at the same
> time. If a quota is violated, the response is delayed. Using Jay's
> example of disk reads for fetches happening in the network thread, we
> can't record and delay a response after the disk reads. We could record
> the time spent on the network thread when the response is complete and
> introduce a delay for handling a subsequent request (separating out
> recording and quota violation handling in the case of network thread
> overload). Does that make sense?
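>
> In outline, that would look something like this (an illustrative sketch
> only; the method names are made up, not from the KIP or the broker
> code):
>
>     // Record network thread time after the response completes; we cannot
>     // delay a response whose disk reads already happened on the network
>     // thread.
>     void onResponseComplete(String clientId, long networkThreadTimeNanos) {
>         networkTimeSensorFor(clientId).record(networkThreadTimeNanos);
>     }
>
>     // Apply any accumulated quota violation to the *next* request instead.
>     void onRequestReceived(String clientId, Request request) {
>         long delayMs = quotaViolationDelayMs(clientId); // 0 if within quota
>         if (delayMs > 0)
>             delayProcessing(request, delayMs);
>         else
>             process(request);
>     }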
>
> Regards,
>
> Rajini
>
>
> On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com>
> wrote:
>
> Hey Jay,
>
> Yeah, I agree that enforcing the CPU time is a little tricky. I am
> thinking that maybe we can use the existing request statistics. They are
> already very detailed, so we can probably see the approximate CPU time
> from them, e.g. something like (total_time -
> request/response_queue_time - remote_time).
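>
> In other words, something like (illustrative; the names below are
> descriptive, not the actual sensor names):
>
>     // Approximate on-thread processing time from the existing
>     // per-request time metrics.
>     long approxCpuTimeMs = totalTimeMs
>         - requestQueueTimeMs    // waiting to be picked up by a handler
>         - responseQueueTimeMs   // waiting to be sent back
>         - remoteTimeMs;         // waiting in purgatory / on remote replicas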
>
> I agree with Guozhang that when a user is throttled, it is likely that
> we need to see if anything has gone wrong first, and if the users are
> well behaved and just need more resources, we will have to bump up the
> quota for them. It is true that pre-allocating CPU time quota precisely
> for the users is difficult. So in practice it would probably be more
> like first setting a relatively high protective CPU time quota for
> everyone, and then increasing it for some individual clients on demand.
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
>
> On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com>
> wrote:
>
> This is a great proposal, glad to see it happening.
>
> I am inclined to the CPU throttling, or more specifically the processing
> time ratio, instead of the request rate throttling as well. Becket has
> summed up my rationales very well above, and one thing to add here is
> that the former has good support both for "protecting against rogue
> clients" and for "utilizing a cluster for multi-tenancy usage": when
> thinking about how to explain this to the end users, I find it actually
> more natural than the request rate since, as mentioned above, different
> requests will have quite different "costs", and Kafka today already has
> various request types (produce, fetch, admin, metadata, etc); because of
> that, request rate throttling may not be as effective unless it is set
> very conservatively.
>
> Regarding user reactions when they are throttled, I think it may differ
> case-by-case, and needs to be discovered / guided by looking at the
> relevant metrics. So in other words users would not expect to get
> additional information by simply being told "hey, you are throttled",
> which is all that throttling does; they need to take a follow-up step
> and see "hmm, I'm throttled probably because of ..", which is done by
> looking at other metric values: e.g. whether I'm bombarding the brokers
> with ...
>
> [Message clipped]

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Consider modeling as n * 100% unit. For 2), the question is what's causing
the I/O threads to be saturated. It's unlikely that all users' utilization
has increased at the same time. A more likely case is that a few isolated
users' utilization has increased. If so, after increasing the number of
threads, the admin just needs to adjust the quotas for a few isolated users,
which is expected and is less work.

Consider modeling as 1 * 100% unit. For 1), all users' quotas need to be
adjusted, which is unexpected and is more work.

So, to me, the n * 100% model seems more convenient.

As for future extension to cover network thread utilization, I was thinking
that one way is to simply model the capacity as (n + m) * 100% unit, where
n and m are the number of network and i/o threads, respectively. Then, for
each user, we can just add up the utilization in the network and the i/o
thread. If we do this, we don't need a new type of quota.
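
As a rough illustration of the accounting behind both points, here is a
minimal sketch (the names and structure are assumptions for illustration
only, not the proposed implementation):

    // Capacity is one unit (100%) per thread, so adding threads adds capacity
    // without changing the absolute meaning of existing user quotas.
    case class ThreadPools(numIoThreads: Int, numNetworkThreads: Int)

    // n * 100 for io threads alone; (n + m) * 100 once network threads are
    // also covered, with no new quota type needed
    def totalCapacityPercent(pools: ThreadPools, includeNetwork: Boolean): Int = {
      val threads = pools.numIoThreads +
        (if (includeNetwork) pools.numNetworkThreads else 0)
      threads * 100
    }

    // A user's measured utilization over a quota window: io-thread time plus
    // network-thread time, expressed against one thread's window.
    def userUtilizationPercent(ioTimeMs: Double, networkTimeMs: Double,
                               windowMs: Double): Double =
      (ioTimeMs + networkTimeMs) / windowMs * 100.0

    // Throttle when the user's utilization exceeds their absolute quota.
    def overQuota(utilizationPercent: Double, quotaPercent: Double): Boolean =
      utilizationPercent > quotaPercent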

Thanks,

Jun


On Thu, Mar 2, 2017 at 12:27 PM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Jun,
>
> If we use request.percentage as the percentage used in a single I/O thread,
> the total percentage being allocated will be num.io.threads * 100 for I/O
> threads and num.network.threads * 100 for network threads. A single quota
> covering the two as a percentage wouldn't quite work if you want to
> allocate the same proportion in both cases. If we want to treat threads as
> separate units, won't we need two quota configurations regardless of
> whether we use units or percentage? Perhaps I misunderstood your
> suggestion.
>
> I think there are two cases:
>
>    1. The use case that you mentioned where an admin is adding more users
>    and decides to add more I/O threads and expects to find free quota to
>    allocate for new users.
>    2. Admin adds more I/O threads because the I/O threads are saturated and
>    there are cores available to allocate, even though the number of
>    users/clients hasn't changed.
>
> If we treated I/O threads as a single unit of 100%, all user quotas need to
> be reallocated for 1). If we treated I/O threads as n units with n*100%, all
> user quotas need to be reallocated for 2); otherwise some of the new threads
> may just not be used. Either way, it should be easy to write a script to
> decrease/increase quotas by a multiple for all users (see the sketch below).
>
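As an aside, the rescaling itself is simple arithmetic under either model; a
minimal sketch (purely illustrative, assuming a per-user quota expressed in
percent):

    // Under 1 * 100%: growing the pool from oldThreads to newThreads means
    // each existing quota must shrink by oldThreads/newThreads to keep the
    // same absolute allocation (case 1 above).
    def rescaleForSharedUnit(quotaPercent: Double,
                             oldThreads: Int, newThreads: Int): Double =
      quotaPercent * oldThreads / newThreads

    // Under n * 100%: quotas are absolute, so to hand newly added capacity
    // to existing users (case 2 above), scale the other way.
    def rescaleForPerThreadUnits(quotaPercent: Double,
                                 oldThreads: Int, newThreads: Int): Double =
      quotaPercent * newThreads / oldThreads
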
> So it really boils down to which quota unit is most intuitive in terms of
> configuration. And from the discussion so far, it feels like opinion is
> divided on whether quotas should be carved out of an absolute 100% (or 1
> unit) or be relative to the number of threads (n*100% or n units).
>
>
>
> On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io> wrote:
>
> > Another way to express an absolute limit is to use request.percentage, but
> > treat it as the percentage used in a single request handling thread. For
> > now, the request handling threads can be just the io threads. In the
> > future, they can cover the network threads as well. This is similar to how
> > top reports CPU usage and may be a bit easier for people to understand.
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Jay,
> > >
> > > 2. Regarding request.unit vs request.percentage. I started with
> > > request.percentage too. The reasoning for request.unit is the following.
> > > Suppose that the capacity has been reached on a broker and the admin
> > > needs to add a new user. A simple way to increase the capacity is to
> > > increase the number of io threads, assuming there are still enough cores.
> > > If the limit is based on percentage, the additional capacity
> > > automatically gets distributed to existing users and we haven't really
> > > carved out any additional resource for the new user. Now, is it easy for
> > > a user to reason about 0.1 unit vs 10%? My feeling is that both are hard
> > > and have to be configured empirically. Not sure if percentage is
> > > obviously easier to reason about.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:
> > >
> > >> A couple of quick points:
> > >>
> > >> 1. Even though the implementation of this quota is only using io thread
> > >> time, i think we should call it something like "request-time". This will
> > >> give us flexibility to improve the implementation to cover network
> > >> threads in the future and will avoid exposing internal details like our
> > >> thread pools on the server.
> > >>
> > >> 2. Jun/Roger, I get what you are trying to fix but the idea of
> > >> thread/units is super unintuitive as a user-facing knob. I had to read
> > >> the KIP like eight times to understand this. I'm not sure about your
> > >> point that increasing the number of threads is a problem with a
> > >> percentage-based value; it really depends on whether the user thinks
> > >> about the "percentage of request processing time" or "thread units". If
> > >> they think "I have allocated 10% of my request processing time to user x"
> > >> then it is a bug that increasing the thread count decreases that percent
> > >> as it does in the current proposal. As a practical matter I think the
> > >> only way to actually reason about this is as a percent---I just don't
> > >> believe people are going to think, "ah, 4.3 thread units, that is the
> > >> right amount!". Instead I think they have to understand this thread unit
> > >> concept, figure out what they have set in number of threads, compute a
> > >> percent and then come up with the number of thread units, and these will
> > >> all be wrong if that thread count changes. I also think this ties us to
> > >> throttling the I/O thread pool, which may not be where we want to end up.
> > >>
> > >> 3. For what it's worth I do think having a single throttle_ms field in
> > >> all the responses that combines all throttling from all quotas is
> > >> probably the simplest. There could be a use case for having separate
> > >> fields for each, but I think that is actually harder to use/monitor in
> > >> the common case so unless someone has a use case I think just one should
> > >> be fine.
> > >>
> > >> -Jay
> > >>
> > >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <rajinisivaram@gmail.com>
> > >> wrote:
> > >>
> > >> > I have updated the KIP based on the discussions so far.
> > >> >
> > >> >
> > >> > Regards,
> > >> >
> > >> > Rajini
> > >> >
> > >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com>
> > >> > wrote:
> > >> >
> > >> > > Thank you all for the feedback.
> > >> > >
> > >> > > Ismael #1. It makes sense not to throttle inter-broker requests like
> > >> > > LeaderAndIsr etc. The simplest way to ensure that clients cannot use
> > >> > > these requests to bypass quotas for DoS attacks is to ensure that ACLs
> > >> > > prevent clients from using these requests and that unauthorized
> > >> > > requests are counted towards quotas.
> > >> > >
> > >> > > Ismael #2, Jay #1: I was thinking that these quotas can return a
> > >> > > separate throttle time, and all utilization based quotas could use the
> > >> > > same field (we won't add another one for network thread utilization,
> > >> > > for instance). But perhaps it makes sense to keep byte rate quotas
> > >> > > separate in produce/fetch responses to provide separate metrics? Agree
> > >> > > with Ismael that the name of the existing field should be changed if we
> > >> > > have two. Happy to switch to a single combined throttle time if that is
> > >> > > sufficient.
> > >> > >
> > >> > > Ismael #4, #5, #6: Will update KIP. Will use a dot separated name for
> > >> > > the new property. Replication quotas use dot separated, so it will be
> > >> > > consistent with all properties except byte rate quotas.
> > >> > >
> > >> > > Radai: #1 Request processing time rather than request rate was chosen
> > >> > > because the time per request can vary significantly between requests,
> > >> > > as mentioned in the discussion and the KIP.
> > >> > > #2 Two separate quotas for heartbeats/regular requests feel like more
> > >> > > configuration and more metrics. Since most users would set quotas
> > >> > > higher than the expected usage and quotas are more of a safety net, a
> > >> > > single quota should work in most cases.
> > >> > > #3 The number of requests in purgatory is limited by the number of
> > >> > > active connections since only one request per connection will be
> > >> > > throttled at a time.
> > >> > > #4 As with byte rate quotas, to use the full allocated quotas,
> > >> > > clients/users would need to use partitions that are distributed across
> > >> > > the cluster. The alternative of using cluster-wide quotas instead of
> > >> > > per-broker quotas would be far too complex to implement.
> > >> > >
> > >> > > Dong: We currently have two ClientQuotaManagers for the quota types
> > >> > > Fetch and Produce. A new one will be added for IOThread, which manages
> > >> > > quotas for I/O thread utilization. This will not update the Fetch or
> > >> > > Produce queue-size, but will have a separate metric for the queue-size.
> > >> > > I wasn't planning to add any additional metrics apart from the
> > >> > > equivalent ones for existing quotas as part of this KIP. The ratio of
> > >> > > byte-rate to I/O thread utilization could be slightly misleading since
> > >> > > it depends on the sequence of requests. But we can look into more
> > >> > > metrics after the KIP is implemented if required.
> > >> > >
> > >> > > I think we need to limit the maximum delay since all requests are
> > >> > > throttled. If a client has a quota of 0.001 units and a single request
> > >> > > used 50ms, we don't want to delay all requests from the client by 50
> > >> > > seconds, throwing the client out of all its consumer groups. The issue
> > >> > > arises only if a user is allocated a quota that is insufficient to
> > >> > > process one large request. The expectation is that the units allocated
> > >> > > per user will be much higher than the time taken to process one
> > >> > > request, and the limit should seldom be applied. Agree this needs
> > >> > > proper documentation.
> > >> > >
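A minimal sketch of the capped delay computation described above
(illustrative only; the names are not from the KIP):

    // Delay needed to bring a client's average usage back under quota,
    // capped at the quota window so one expensive request cannot stall
    // the client for tens of seconds.
    def throttleTimeMs(threadTimeMs: Double, quotaUnits: Double,
                       windowMs: Long): Long = {
      val naiveDelayMs = threadTimeMs / quotaUnits - threadTimeMs
      math.min(naiveDelayMs.toLong, windowMs)
    }
    // e.g. threadTimeMs = 50.0 and quotaUnits = 0.001 gives a naive delay of
    // 49,950 ms, which is capped to the window instead of a ~50 second stall
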
> > >> > > Regards,
> > >> > >
> > >> > > Rajini
> > >> > >
> > >> > >
> > >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <radai.rosenblatt@gmail.com>
> > >> > > wrote:
> > >> > >
> > >> > >> @jun: i wasnt concerned about tying up a request processing thread,
> > >> > >> but IIUC the code does still read the entire request out, which might
> > >> > >> add up to a non-negligible amount of memory.
> > >> > >>
> > >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com>
> > >> > >> wrote:
> > >> > >>
> > >> > >> > Hey Rajini,
> > >> > >> >
> > >> > >> > The current KIP says that the maximum delay will be reduced to the
> > >> > >> > window size if it is larger than the window size. I have a concern
> > >> > >> > with this:
> > >> > >> >
> > >> > >> > 1) This essentially means that the user is allowed to exceed their
> > >> > >> > quota over a long period of time. Can you provide an upper bound on
> > >> > >> > this deviation?
> > >> > >> >
> > >> > >> > 2) What is the motivation for capping the maximum delay by the
> > >> > >> > window size? I am wondering if there is a better alternative to
> > >> > >> > address the problem.
> > >> > >> >
> > >> > >> > 3) It means that the existing metric-related config will have a
> > >> > >> > more direct impact on the mechanism of this io-thread-unit-based
> > >> > >> > quota. This may be an important change depending on the answer to
> > >> > >> > 1) above. We probably need to document this more explicitly.
> > >> > >> >
> > >> > >> > Dong
> > >> > >> >
> > >> > >> >
> > >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <lindong28@gmail.com>
> > >> > >> > wrote:
> > >> > >> >
> > >> > >> > > Hey Jun,
> > >> > >> > >
> > >> > >> > > Yeah you are right. I thought it wasn't because at LinkedIn it
> > >> > >> > > would be too much pressure on inGraph to expose those
> > >> > >> > > per-clientId metrics, so we ended up printing them periodically
> > >> > >> > > to a local log. Never mind if it is not a general problem.
> > >> > >> > >
> > >> > >> > > Hey Rajini,
> > >> > >> > >
> > >> > >> > > - I agree with Jay that we probably don't want to add a new field
> > >> > >> > > for every quota in ProduceResponse or FetchResponse. Is there any
> > >> > >> > > use-case for having separate throttle-time fields for
> > >> > >> > > byte-rate-quota and io-thread-unit-quota? You probably need to
> > >> > >> > > document this as an interface change if you plan to add a new
> > >> > >> > > field in any request.
> > >> > >> > >
> > >> > >> > > - I don't think IOThread belongs to quotaType. The existing quota
> > >> > >> > > types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> > >> > >> > > identify the type of request that is throttled, not the quota
> > >> > >> > > mechanism that is applied.
> > >> > >> > >
> > >> > >> > > - If a request is throttled due to this io-thread-unit-based
> > >> > >> > > quota, is the existing queue-size metric in ClientQuotaManager
> > >> > >> > > incremented?
> > >> > >> > >
> > >> > >> > > - In the interest of providing a guideline for the admin to
> > >> > >> > > decide the io-thread-unit-based quota and for users to understand
> > >> > >> > > its impact on their traffic, would it be useful to have a metric
> > >> > >> > > that shows the overall byte-rate per io-thread-unit? Can we also
> > >> > >> > > show this as a per-clientId metric?
> > >> > >> > >
> > >> > >> > > Thanks,
> > >> > >> > > Dong
> > >> > >> > >
> > >> > >> > >
> > >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io>
> > >> > >> > > wrote:
> > >> > >> > >
> > >> > >> > >> Hi, Ismael,
> > >> > >> > >>
> > >> > >> > >> For #3, typically, an admin won't configure more io threads than
> > >> > >> > >> CPU cores, but it's possible for an admin to start with fewer io
> > >> > >> > >> threads than cores and grow that later on.
> > >> > >> > >>
> > >> > >> > >> Hi, Dong,
> > >> > >> > >>
> > >> > >> > >> I think the throttleTime sensor on the broker tells the admin
> > >> > >> > >> whether a user/clientId is throttled or not.
> > >> > >> > >>
> > >> > >> > >> Hi, Radai,
> > >> > >> > >>
> > >> > >> > >> The reasoning for delaying the throttled requests on the broker
> > >> > >> > >> instead of returning an error immediately is that the latter has
> > >> > >> > >> no way to prevent the client from retrying immediately, which
> > >> > >> > >> will make things worse. The delaying logic is based off a delay
> > >> > >> > >> queue. A separate expiration thread just waits on the next to be
> > >> > >> > >> expired request. So, it doesn't tie up a request handler thread.
> > >> > >> > >>
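The delay-queue mechanism described above maps naturally onto
java.util.concurrent.DelayQueue; a minimal sketch (the ThrottledResponse type
and send callback are illustrative assumptions, not the broker's actual
classes):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // A response parked until its throttle delay expires.
    class ThrottledResponse(expireTimeMs: Long, val send: () => Unit)
        extends Delayed {
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(expireTimeMs - System.currentTimeMillis(),
                     TimeUnit.MILLISECONDS)
      override def compareTo(other: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.MILLISECONDS),
                               other.getDelay(TimeUnit.MILLISECONDS))
    }

    val delayQueue = new DelayQueue[ThrottledResponse]()

    // One expiration thread blocks on the queue and sends each response when
    // it is due, so no request handler thread is tied up while waiting.
    val expirationThread = new Thread(new Runnable {
      override def run(): Unit =
        while (!Thread.currentThread().isInterrupted) delayQueue.take().send()
    })
    expirationThread.setDaemon(true)
    expirationThread.start()
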
> > >> > >> > >> Thanks,
> > >> > >> > >>
> > >> > >> > >> Jun
> > >> > >> > >>
> > >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk>
> > >> > >> > >> wrote:
> > >> > >> > >>
> > >> > >> > >> > Hi Jay,
> > >> > >> > >> >
> > >> > >> > >> > Regarding 1, I definitely like the simplicity of keeping a
> > >> > >> > >> > single throttle time field in the response. The downside is
> > >> > >> > >> > that the client metrics will be more coarse grained.
> > >> > >> > >> >
> > >> > >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage`
> > >> > >> > >> > and `log.cleaner.min.cleanable.ratio`.
> > >> > >> > >> >
> > >> > >> > >> > Ismael
> > >> > >> > >> >
> > >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <jay@confluent.io>
> > >> > >> > >> > wrote:
> > >> > >> > >> >
> > >> > >> > >> > > A few minor comments:
> > >> > >> > >> > >
> > >> > >> > >> > >    1. Isn't it the case that the throttling time response
> > >> > >> > >> > >    field should have the total time your request was
> > >> > >> > >> > >    throttled, irrespective of the quotas that caused it?
> > >> > >> > >> > >    Limiting it to the byte rate quota doesn't make sense,
> > >> > >> > >> > >    but I also don't think we want to end up adding new
> > >> > >> > >> > >    fields in the response for every single thing we quota,
> > >> > >> > >> > >    right?
> > >> > >> > >> > >    2. I don't think we should make this quota specifically
> > >> > >> > >> > >    about io threads. Once we introduce these quotas people
> > >> > >> > >> > >    set them and expect them to be enforced (and if they
> > >> > >> > >> > >    aren't it may cause an outage). As a result they are a
> > >> > >> > >> > >    bit more sensitive than normal configs, I think. The
> > >> > >> > >> > >    current thread pools seem like something of an
> > >> > >> > >> > >    implementation detail and not the level the user-facing
> > >> > >> > >> > >    quotas should be involved with. I think it might be
> > >> > >> > >> > >    better to make this a general request-time throttle with
> > >> > >> > >> > >    no mention in the naming about I/O threads and simply
> > >> > >> > >> > >    acknowledge the current limitation (which we may someday
> > >> > >> > >> > >    fix) in the docs that this covers only the time after
> > >> > >> > >> > >    the request is read off the network.
> > >> > >> > >> > >    3. As such I think the right interface to the user would
> > >> > >> > >> > >    be something like percent_request_time in {0,...,100} or
> > >> > >> > >> > >    request_time_ratio in {0.0,...,1.0} (I think "ratio" is
> > >> > >> > >> > >    the terminology we used if the scale is between 0 and 1
> > >> > >> > >> > >    in the other metrics, right?)
> > >> > >> > >> > >
> > >> > >> > >> > > -Jay
> > >> > >> > >> > >
> > >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram
> > >> > >> > >> > > <rajinisivaram@gmail.com> wrote:
> > >> > >> > >> > >
> > >> > >> > >> > > > Guozhang/Dong,
> > >> > >> > >> > > >
> > >> > >> > >> > > > Thank you for the feedback.
> > >> > >> > >> > > >
> > >> > >> > >> > > > Guozhang: I have updated the section on co-existence of
> > >> > >> > >> > > > byte rate and request time quotas.
> > >> > >> > >> > > >
> > >> > >> > >> > > > Dong: I hadn't added much detail to the metrics and
> > >> > >> > >> > > > sensors since they are going to be very similar to the
> > >> > >> > >> > > > existing metrics and sensors. To avoid confusion, I have
> > >> > >> > >> > > > now added more detail. All metrics are in the group
> > >> > >> > >> > > > "quotaType" and all sensors have names starting with
> > >> > >> > >> > > > "quotaType" (where quotaType is Produce/Fetch/
> > >> > >> > >> > > > LeaderReplication/FollowerReplication/*IOThread*). So
> > >> > >> > >> > > > there will be no reuse of existing metrics/sensors. The
> > >> > >> > >> > > > new ones for request processing time based throttling
> > >> > >> > >> > > > will be completely independent of existing
> > >> > >> > >> > > > metrics/sensors, but will be consistent in format.
> > >> > >> > >> > > >
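To make the naming convention concrete, here is a sketch using the Kafka
metrics API (the sensor and metric names below are illustrative, not the
final ones from the KIP):

    import org.apache.kafka.common.metrics.Metrics
    import org.apache.kafka.common.metrics.stats.Avg

    val metrics = new Metrics()
    // Each quota type gets its own sensors and metric group, so the new
    // IOThread sensors stay independent of the existing Produce/Fetch ones.
    val quotaType = "IOThread"
    val throttleTimeSensor = metrics.sensor(s"${quotaType}ThrottleTime-userA")
    throttleTimeSensor.add(
      metrics.metricName("throttle-time", quotaType,
        "Average throttle time per user for this quota type"),
      new Avg())
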
> > >> > >> > >> > > > The existing throttle_time_ms field in produce/fetch
> > >> > >> > >> > > > responses will not be impacted by this KIP. That will
> > >> > >> > >> > > > continue to return byte-rate based throttling times. In
> > >> > >> > >> > > > addition, a new field request_throttle_time_ms will be
> > >> > >> > >> > > > added to return request quota based throttling times.
> > >> > >> > >> > > > These will be exposed as new metrics on the client-side.
> > >> > >> > >> > > >
> > >> > >> > >> > > > Since all metrics and sensors are different for each type
> > >> > >> > >> > > > of quota, I believe there are already sufficient metrics
> > >> > >> > >> > > > to monitor throttling on both client and broker side for
> > >> > >> > >> > > > each type of throttling.
> > >> > >> > >> > > >
> > >> > >> > >> > > > Regards,
> > >> > >> > >> > > >
> > >> > >> > >> > > > Rajini
> > >> > >> > >> > > >
> > >> > >> > >> > > >
> > >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin
> > >> > >> > >> > > > <lindong28@gmail.com> wrote:
> > >> > >> > >> > > >
> > >> > >> > >> > > > > Hey Rajini,
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > I think it makes a lot of sense to use io_thread_units
> > >> > >> > >> > > > > as the metric to quota users' traffic here. LGTM
> > >> > >> > >> > > > > overall. I have some questions regarding sensors.
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > - Can you be more specific in the KIP about what
> > >> > >> > >> > > > > sensors will be added? For example, it will be useful
> > >> > >> > >> > > > > to specify the name and attributes of these new
> > >> > >> > >> > > > > sensors.
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > - We currently have throttle-time and queue-size for
> > >> > >> > >> > > > > the byte-rate based quota. Are you going to have a
> > >> > >> > >> > > > > separate throttle-time and queue-size for requests
> > >> > >> > >> > > > > throttled by the io_thread_unit-based quota, or will
> > >> > >> > >> > > > > they share the same sensor?
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > - Does the throttle-time in the ProduceResponse and
> > >> > >> > >> > > > > FetchResponse contain time due to the
> > >> > >> > >> > > > > io_thread_unit-based quota?
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > - Currently the kafka server doesn't provide any log or
> > >> > >> > >> > > > > metrics that tell whether any given clientId (or user)
> > >> > >> > >> > > > > is throttled. This is not too bad because we can still
> > >> > >> > >> > > > > check the client-side byte-rate metric to validate
> > >> > >> > >> > > > > whether a given client is throttled. But with this
> > >> > >> > >> > > > > io_thread_unit, there will be no way to validate
> > >> > >> > >> > > > > whether a given client is slow because it has exceeded
> > >> > >> > >> > > > > its io_thread_unit limit. It is necessary for the user
> > >> > >> > >> > > > > to be able to know this information to figure out
> > >> > >> > >> > > > > whether they have reached their quota limit. How about
> > >> > >> > >> > > > > we add a log4j log on the server side to periodically
> > >> > >> > >> > > > > print the (client_id, byte-rate-throttle-time,
> > >> > >> > >> > > > > io-thread-unit-throttle-time) so that the kafka
> > >> > >> > >> > > > > administrator can identify those users that have
> > >> > >> > >> > > > > reached their limit and act accordingly?
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > Thanks,
> > >> > >> > >> > > > > Dong
> > >> > >> > >> > > > >
> > >> > >> > >> > > > >
> > >> > >> > >> > > > >
> > >> > >> > >> > > > >
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang
> > >> > >> > >> > > > > <wangguoz@gmail.com> wrote:
> > >> > >> > >> > > > >
> > >> > >> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
> > >> > >> > >> > > > > > comment on the throttling implementation:
> > >> > >> > >> > > > > >
> > >> > >> > >> > > > > > Stated as "Request processing time throttling will be
> > >> > >> > >> > > > > > applied on top if necessary." I thought that it meant
> > >> > >> > >> > > > > > the request processing time throttling is applied
> > >> > >> > >> > > > > > first, but continuing to read I found it actually
> > >> > >> > >> > > > > > meant to apply produce / fetch byte rate throttling
> > >> > >> > >> > > > > > first.
> > >> > >> > >> > > > > >
> > >> > >> > >> > > > > > Also the last sentence, "The remaining delay if any
> > >> > >> > >> > > > > > is applied to the response.", is a bit confusing to
> > >> > >> > >> > > > > > me. Maybe reword it a bit?
> > >> > >> > >> > > > > >
> > >> > >> > >> > > > > >
> > >> > >> > >> > > > > > Guozhang
> > >> > >> > >> > > > > >
> > >> > >> > >> > > > > >
> > >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao
> > >> > >> > >> > > > > > <jun@confluent.io> wrote:
> > >> > >> > >> > > > > >
> > >> > >> > >> > > > > > > Hi, Rajini,
> > >> > >> > >> > > > > > >
> > >> > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal
> > >> > >> > >> > > > > > > looks good to me.
> > >> > >> > >> > > > > > >
> > >> > >> > >> > > > > > > Jun
> > >> > >> > >> > > > > > >
> > >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram
> > >> > >> > >> > > > > > > <rajinisivaram@gmail.com> wrote:
> > >> > >> > >> > > > > > >
> > >> > >> > >> > > > > > > > Jun/Roger,
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > > Thank you for the feedback.
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > > 1. I have updated the KIP to use absolute units
> > >> > >> > >> > > > > > > > instead of percentage. The property is called
> > >> > >> > >> > > > > > > > *io_thread_units* to align with the thread count
> > >> > >> > >> > > > > > > > property *num.io.threads*. When we implement
> > >> > >> > >> > > > > > > > network thread utilization quotas, we can add
> > >> > >> > >> > > > > > > > another property *network_thread_units*.
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > > 2. ControlledShutdown is already listed under the
> > >> > >> > >> > > > > > > > exempt requests. Jun, did you mean a different
> > >> > >> > >> > > > > > > > request that needs to be added? The four requests
> > >> > >> > >> > > > > > > > currently exempt in the KIP are StopReplica,
> > >> > >> > >> > > > > > > > ControlledShutdown, LeaderAndIsr and
> > >> > >> > >> > > > > > > > UpdateMetadata. These are controlled using the
> > >> > >> > >> > > > > > > > ClusterAction ACL, so it is easy to exclude them
> > >> > >> > >> > > > > > > > and only throttle if unauthorized. I wasn't sure
> > >> > >> > >> > > > > > > > if there are other requests used only for
> > >> > >> > >> > > > > > > > inter-broker communication that needed to be
> > >> > >> > >> > > > > > > > excluded.
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > > 3. I was thinking the smallest change would be to
> > >> > >> > >> > > > > > > > replace all references to
> > >> > >> > >> > > > > > > > *requestChannel.sendResponse()* with a local
> > >> > >> > >> > > > > > > > method *sendResponseMaybeThrottle()* that does
> > >> > >> > >> > > > > > > > the throttling if needed and then sends the
> > >> > >> > >> > > > > > > > response. If we throttle first in
> > >> > >> > >> > > > > > > > *KafkaApis.handle()*, the time spent within the
> > >> > >> > >> > > > > > > > method handling the request will not be recorded
> > >> > >> > >> > > > > > > > or used in throttling. We can look into this
> > >> > >> > >> > > > > > > > again when the PR is ready for review.
> > >> > >> > >> > > > > > > >
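A minimal sketch of that refactor (types reduced to stand-ins; the real
signatures in KafkaApis will differ):

    // Stand-ins for broker types, purely for illustration.
    trait QuotaManager {
      def recordAndGetThrottleTimeMs(clientId: String, timeMs: Double): Long
    }
    case class Response(clientId: String)

    class ApisSketch(quota: QuotaManager,
                     sendNow: Response => Unit,
                     sendLater: (Response, Long) => Unit) {
      // Handlers call this instead of requestChannel.sendResponse(), so the
      // time spent inside the handler is recorded and throttled in one place.
      def sendResponseMaybeThrottle(response: Response,
                                    handlerTimeMs: Double): Unit = {
        val throttleMs = quota.recordAndGetThrottleTimeMs(response.clientId,
                                                          handlerTimeMs)
        if (throttleMs > 0) sendLater(response, throttleMs) // park until due
        else sendNow(response)
      }
    }
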
> > >> > >> > >> > > > > > > > Regards,
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > > Rajini
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover
> > >> > >> > >> > > > > > > > <roger.hoover@gmail.com> wrote:
> > >> > >> > >> > > > > > > >
> > >> > >> > >> > > > > > > > > Great to see this KIP and the excellent
> > >> > >> > >> > > > > > > > > discussion.
> > >> > >> > >> > > > > > > > >
> > >> > >> > >> > > > > > > > > To me, Jun's suggestion makes sense. If my
> > >> > >> > >> > > > > > > > > application is allocated 1 request handler
> > >> > >> > >> > > > > > > > > unit, then it's as if I have a Kafka broker
> > >> > >> > >> > > > > > > > > with a single request handler thread dedicated
> > >> > >> > >> > > > > > > > > to me. That's the most I can use, at least.
> > >> > >> > >> > > > > > > > > That allocation doesn't change even if an admin
> > >> > >> > >> > > > > > > > > later increases the size of the request thread
> > >> > >> > >> > > > > > > > > pool on the broker. It's similar to the CPU
> > >> > >> > >> > > > > > > > > abstraction that VMs and containers get from
> > >> > >> > >> > > > > > > > > hypervisors or OS schedulers. While different
> > >> > >> > >> > > > > > > > > client access patterns can use wildly different
> > >> > >> > >> > > > > > > > > amounts of request thread resources per
> > >> > >> > >> > > > > > > > > request, a given application will generally
> > >> > >> > >> > > > > > > > > have a stable access pattern and can figure out
> > >> > >> > >> > > > > > > > > empirically how many "request thread units" it
> > >> > >> > >> > > > > > > > > needs to meet its throughput/latency goals.
> > >> > >> > >> > > > > > > > >
> > >> > >> > >> > > > > > > > > Cheers,
> > >> > >> > >> > > > > > > > >
> > >> > >> > >> > > > > > > > > Roger
> > >> > >> > >> > > > > > > > >
> > >> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao
> > >> > >> > >> > > > > > > > > <jun@confluent.io> wrote:
> > >> > >> > >> > > > > > > > >
> > >> > >> > >> > > > > > > > > > Hi, Rajini,
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > Thanks for the updated KIP. A few more
> > >> > >> > >> > > > > > > > > > comments.
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > 1. A concern with request_time_percent is
> > >> > >> > >> > > > > > > > > > that it's not an absolute value. Let's say
> > >> > >> > >> > > > > > > > > > you give a user a 10% limit. If the admin
> > >> > >> > >> > > > > > > > > > doubles the number of request handler
> > >> > >> > >> > > > > > > > > > threads, that user now actually has twice the
> > >> > >> > >> > > > > > > > > > absolute capacity. This may confuse people a
> > >> > >> > >> > > > > > > > > > bit. So, perhaps setting the quota based on
> > >> > >> > >> > > > > > > > > > an absolute request thread unit is better.
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
> > >> > >> > >> > > > > > > > > > inter-broker request and needs to be excluded
> > >> > >> > >> > > > > > > > > > from throttling.
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > 3. Implementation wise, I am wondering if
> > >> > >> > >> > > > > > > > > > it's simpler to apply the request time
> > >> > >> > >> > > > > > > > > > throttling first in KafkaApis.handle().
> > >> > >> > >> > > > > > > > > > Otherwise, we will need to add the throttling
> > >> > >> > >> > > > > > > > > > logic in each type of request.
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > Thanks,
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > Jun
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini
> > >> > >> > >> > > > > > > > > > Sivaram <rajinisivaram@gmail.com> wrote:
> > >> > >> > >> > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > Jun,
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > Thank you for the review.
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > I have reverted to the original KIP that
> > >> > >> > >> > > > > > > > > > > throttles based on request handler
> > >> > >> > >> > > > > > > > > > > utilization. At the moment, it uses
> > >> > >> > >> > > > > > > > > > > percentage, but I am happy to change to a
> > >> > >> > >> > > > > > > > > > > fraction (out of 1 instead of 100) if
> > >> > >> > >> > > > > > > > > > > required. I have added the examples from
> > >> > >> > >> > > > > > > > > > > this discussion to the KIP. Also added a
> > >> > >> > >> > > > > > > > > > > "Future Work" section to address network
> > >> > >> > >> > > > > > > > > > > thread utilization. The configuration is
> > >> > >> > >> > > > > > > > > > > named "request_time_percent" with the
> > >> > >> > >> > > > > > > > > > > expectation that it can also be used as the
> > >> > >> > >> > > > > > > > > > > limit for network thread utilization when
> > >> > >> > >> > > > > > > > > > > that is implemented, so that users have to
> > >> > >> > >> > > > > > > > > > > set only one config for the two and not
> > >> > >> > >> > > > > > > > > > > have to worry about the internal
> > >> > >> > >> > > > > > > > > > > distribution of the work between the two
> > >> > >> > >> > > > > > > > > > > thread pools in Kafka.
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > Regards,
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > Rajini
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao
> > >> > >> > >> > > > > > > > > > > <jun@confluent.io> wrote:
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > Hi, Rajini,
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > Thanks for the proposal.
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > The benefit of using the request
> > >> > >> > >> > > > > > > > > > > > processing time over the request rate is
> > >> > >> > >> > > > > > > > > > > > exactly what people have said. I will
> > >> > >> > >> > > > > > > > > > > > just expand on that a bit. Consider the
> > >> > >> > >> > > > > > > > > > > > following case. The producer sends a
> > >> > >> > >> > > > > > > > > > > > produce request with a 10MB message but
> > >> > >> > >> > > > > > > > > > > > compressed to 100KB with gzip. The
> > >> > >> > >> > > > > > > > > > > > decompression of the message on the
> > >> > >> > >> > > > > > > > > > > > broker could take 10-15 seconds, during
> > >> > >> > >> > > > > > > > > > > > which time a request handler thread is
> > >> > >> > >> > > > > > > > > > > > completely blocked. In this case, neither
> > >> > >> > >> > > > > > > > > > > > the byte-in quota nor the request rate
> > >> > >> > >> > > > > > > > > > > > quota may be effective in protecting the
> > >> > >> > >> > > > > > > > > > > > broker. Consider another case. A consumer
> > >> > >> > >> > > > > > > > > > > > group starts with 10 instances and later
> > >> > >> > >> > > > > > > > > > > > on switches to 20 instances. The request
> > >> > >> > >> > > > > > > > > > > > rate will likely double, but the actual
> > >> > >> > >> > > > > > > > > > > > load on the broker may not double since
> > >> > >> > >> > > > > > > > > > > > each fetch request only contains half of
> > >> > >> > >> > > > > > > > > > > > the partitions. Request rate quota may
> > >> > >> > >> > > > > > > > > > > > not be easy to configure in this case.
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > What we really want is to be able to
> > >> > >> > >> > > > > > > > > > > > prevent a client from using too much of
> > >> > >> > >> > > > > > > > > > > > the server side resources. In this
> > >> > >> > >> > > > > > > > > > > > particular KIP, this resource is the
> > >> > >> > >> > > > > > > > > > > > capacity of the request handler threads.
> > >> > >> > >> > > > > > > > > > > > I agree that it may not be intuitive for
> > >> > >> > >> > > > > > > > > > > > the users to determine how to set the
> > >> > >> > >> > > > > > > > > > > > right limit. However, this is not
> > >> > >> > >> > > > > > > > > > > > completely new and has been done in the
> > >> > >> > >> > > > > > > > > > > > container world already. For example,
> > >> > >> > >> > > > > > > > > > > > Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> > >> > >> > >> > > > > > > > > > > > has the concept of cpu.cfs_quota_us,
> > >> > >> > >> > > > > > > > > > > > which specifies the total amount of time
> > >> > >> > >> > > > > > > > > > > > in microseconds for which all tasks in a
> > >> > >> > >> > > > > > > > > > > > cgroup can run during a one second
> > >> > >> > >> > > > > > > > > > > > period. We can potentially model the
> > >> > >> > >> > > > > > > > > > > > request handler threads in a similar way.
> > >> > >> > >> > > > > > > > > > > > For example, each request handler thread
> > >> > >> > >> > > > > > > > > > > > can be 1 request handler unit and the
> > >> > >> > >> > > > > > > > > > > > admin can configure a limit on how many
> > >> > >> > >> > > > > > > > > > > > units (say 0.01) a client can have.
> > >> > >> > >> > > > > > > > > > > >
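In that model the arithmetic is straightforward; a tiny sketch (the numbers
are hypothetical):

    // 1 request handler thread == 1 unit; a quota of 0.01 units entitles a
    // client to 1% of one thread's time within each quota window
    def allowedHandlerTimeMs(quotaUnits: Double, windowMs: Long): Double =
      quotaUnits * windowMs

    val perSecondMs = allowedHandlerTimeMs(0.01, 1000L) // 10.0 ms per second
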
> > >> > >> > >> > > > > > > > > > > > Regarding not throttling the internal
> > >> > >> > >> > > > > > > > > > > > broker to broker requests: we could do
> > >> > >> > >> > > > > > > > > > > > that. Alternatively, we could just let
> > >> > >> > >> > > > > > > > > > > > the admin configure a high limit for the
> > >> > >> > >> > > > > > > > > > > > kafka user (it may not be able to do that
> > >> > >> > >> > > > > > > > > > > > easily based on clientId though).
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > Ideally we want to be able to protect the
> > >> > >> > >> > > > > > > > > > > > utilization of the network thread pool
> > >> > >> > >> > > > > > > > > > > > too. The difficulty is mostly what Rajini
> > >> > >> > >> > > > > > > > > > > > said: (1) The mechanism for throttling
> > >> > >> > >> > > > > > > > > > > > the requests is through Purgatory and we
> > >> > >> > >> > > > > > > > > > > > will have to think through how to
> > >> > >> > >> > > > > > > > > > > > integrate that into the network layer.
> > >> > >> > >> > > > > > > > > > > > (2) In the network layer, currently we
> > >> > >> > >> > > > > > > > > > > > know the user, but not the clientId of
> > >> > >> > >> > > > > > > > > > > > the request. So, it's a bit tricky to
> > >> > >> > >> > > > > > > > > > > > throttle based on clientId there. Plus,
> > >> > >> > >> > > > > > > > > > > > the byteOut quota can already protect the
> > >> > >> > >> > > > > > > > > > > > network thread utilization for fetch
> > >> > >> > >> > > > > > > > > > > > requests. So, if we can't figure out this
> > >> > >> > >> > > > > > > > > > > > part right now, just focusing on the
> > >> > >> > >> > > > > > > > > > > > request handling threads for this KIP is
> > >> > >> > >> > > > > > > > > > > > still a useful feature.
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > Thanks,
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > Jun
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini
> > >> > >> > >> > > > > > > > > > > > Sivaram <rajinisivaram@gmail.com> wrote:
> > >> > >> > >> > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > Thank you all for the feedback.
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > Jay: I have removed the exemption for
> > >> > >> > >> > > > > > > > > > > > > consumer heartbeat etc. Agree that
> > >> > >> > >> > > > > > > > > > > > > protecting the cluster is more
> > >> > >> > >> > > > > > > > > > > > > important than protecting individual
> > >> > >> > >> > > > > > > > > > > > > apps. Have retained the exemption for
> > >> > >> > >> > > > > > > > > > > > > StopReplica/LeaderAndIsr etc; these are
> > >> > >> > >> > > > > > > > > > > > > throttled only if authorization fails
> > >> > >> > >> > > > > > > > > > > > > (so they can't be used for DoS attacks
> > >> > >> > >> > > > > > > > > > > > > in a secure cluster, but inter-broker
> > >> > >> > >> > > > > > > > > > > > > requests complete without delays).
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > I will wait another day to see if there
> > >> > >> > >> > > > > > > > > > > > > is any objection to quotas based on
> > >> > >> > >> > > > > > > > > > > > > request processing time (as opposed to
> > >> > >> > >> > > > > > > > > > > > > request rate) and if there are no
> > >> > >> > >> > > > > > > > > > > > > objections, I will revert to the
> > >> > >> > >> > > > > > > > > > > > > original proposal with some changes.
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > The original proposal was only
> > >> > >> > >> > > > > > > > > > > > > including the time used by the request
> > >> > >> > >> > > > > > > > > > > > > handler threads (that made the
> > >> > >> > >> > > > > > > > > > > > > calculation easy). I think the
> > >> > >> > >> > > > > > > > > > > > > suggestion is to include the time spent
> > >> > >> > >> > > > > > > > > > > > > in the network threads as well since
> > >> > >> > >> > > > > > > > > > > > > that may be significant. As Jay pointed
> > >> > >> > >> > > > > > > > > > > > > out, it is more complicated to
> > >> > >> > >> > > > > > > > > > > > > calculate the total available CPU time
> > >> > >> > >> > > > > > > > > > > > > and convert to a ratio when there are
> > >> > >> > >> > > > > > > > > > > > > *m* I/O threads and *n* network
> > >> > >> > >> > > > > > > > > > > > > threads. ThreadMXBean#getThreadCPUTime()
> > >> > >> > >> > > > > > > > > > > > > may give us what we want, but it can be
> > >> > >> > >> > > > > > > > > > > > > very expensive on some platforms. As
> > >> > >> > >> > > > > > > > > > > > > Becket and Guozhang have pointed out,
> > >> > >> > >> > > > > > > > > > > > > we do have several time measurements
> > >> > >> > >> > > > > > > > > > > > > already for generating metrics that we
> > >> > >> > >> > > > > > > > > > > > > could use, though we might want to
> > >> > >> > >> > > > > > > > > > > > > switch to nanoTime() instead of
> > >> > >> > >> > > > > > > > > > > > > currentTimeMillis() since some of the
> > >> > >> > >> > > > > > > > > > > > > values for small requests may be < 1ms.
> > >> > >> > >> > > > > > > > > > > > > But rather than add up the time spent
> > >> > >> > >> > > > > > > > > > > > > in the I/O thread and network thread,
> > >> > >> > >> > > > > > > > > > > > > wouldn't it be better to convert the
> > >> > >> > >> > > > > > > > > > > > > time spent on each thread into a
> > >> > >> > >> > > > > > > > > > > > > separate ratio? UserA has a request
> > >> > >> > >> > > > > > > > > > > > > quota of 5%. Can we take that to mean
> > >> > >> > >> > > > > > > > > > > > > that UserA can use 5% of the time on
> > >> > >> > >> > > > > > > > > > > > > network threads and 5% of the time on
> > >> > >> > >> > > > > > > > > > > > > I/O threads? If either is exceeded, the
> > >> > >> > >> > > > > > > > > > > > > response is throttled - it would mean
> > >> > >> > >> > > > > > > > > > > > > maintaining two sets of metrics for the
> > >> > >> > >> > > > > > > > > > > > > two durations, but would result in more
> > >> > >> > >> > > > > > > > > > > > > meaningful ratios. We could define two
> > >> > >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request
> > >> > >> > >> > > > > > > > > > > > > threads and 10% of network threads),
> > >> > >> > >> > > > > > > > > > > > > but that seems unnecessary and harder
> > >> > >> > >> > > > > > > > > > > > > to explain to users.
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > Back to why and how quotas are applied
> > >> > >> > >> > > > > > > > > > > > > to network thread utilization:
> > >> > >> > >> > > > > > > > > > > > > a) In the case of fetch, the time spent
> > >> > >> > >> > > > > > > > > > > > > in the network thread may be
> > >> > >> > >> > > > > > > > > > > > > significant and I can see the need to
> > >> > >> > >> > > > > > > > > > > > > include this. Are there other requests
> > >> > >> > >> > > > > > > > > > > > > where the network thread utilization is
> > >> > >> > >> > > > > > > > > > > > > significant? In the case of fetch,
> > >> > >> > >> > > > > > > > > > > > > request handler thread utilization
> > >> > >> > >> > > > > > > > > > > > > would throttle clients with high
> > >> > >> > >> > > > > > > > > > > > > request rate and low data volume, and
> > >> > >> > >> > > > > > > > > > > > > the fetch byte rate quota will throttle
> > >> > >> > >> > > > > > > > > > > > > clients with high data volume. Network
> > >> > >> > >> > > > > > > > > > > > > thread utilization is perhaps
> > >> > >> > >> > > > > > > > > > > > > proportional to the data volume. I am
> > >> > >> > >> > > > > > > > > > > > > wondering if we even need to throttle
> > >> > >> > >> > > > > > > > > > > > > based on network thread utilization or
> > >> > >> > >> > > > > > > > > > > > > whether the data volume quota covers
> > >> > >> > >> > > > > > > > > > > > > this case.
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > b) At the moment, we record and check
> > >> > >> > >> > > > > > > > > > > > > for quota violation at the same time.
> > >> > >> > >> > > > > > > > > > > > > If a quota is violated, the response is
> > >> > >> > >> > > > > > > > > > > > > delayed. Using Jay's example of disk
> > >> > >> > >> > > > > > > > > > > > > reads for fetches happening in the
> > >> > >> > >> > > > > > > > > > > > > network thread, we can't record and
> > >> > >> > >> > > > > > > > > > > > > delay a response after the disk reads.
> > >> > >> > >> > > > > > > > > > > > > We could record the time spent on the
> > >> > >> > >> > > > > > > > > > > > > network thread when the response is
> > >> > >> > >> > > > > > > > > > > > > complete and introduce a delay for
> > >> > >> > >> > > > > > > > > > > > > handling a subsequent request (separate
> > >> > >> > >> > > > > > > > > > > > > out recording and quota violation
> > >> > >> > >> > > > > > > > > > > > > handling in the case of network thread
> > >> > >> > >> > > > > > > > > > > > > overload). Does that make sense?
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > Regards,
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > Rajini
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM,
> > Becket
> > >> > Qin <
> > >> > >> > >> > > > > > > > becket.qin@gmail.com>
> > >> > >> > >> > > > > > > > > > > > wrote:
> > >> > >> > >> > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > Hey Jay,
> > >> > >> > >> > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the
> CPU
> > >> time
> > >> > >> is a
> > >> > >> > >> > little
> > >> > >> > >> > > > > > > tricky. I
> > >> > >> > >> > > > > > > > > am
> > >> > >> > >> > > > > > > > > > > > > thinking
> > >> > >> > >> > > > > > > > > > > > > > that maybe we can use the existing
> > >> > request
> > >> > >> > >> > > statistics.
> > >> > >> > >> > > > > They
> > >> > >> > >> > > > > > > are
> > >> > >> > >> > > > > > > > > > > already
> > >> > >> > >> > > > > > > > > > > > > > very detailed so we can probably
> see
> > >> the
> > >> > >> > >> > approximate
> > >> > >> > >> > > > CPU
> > >> > >> > >> > > > > > time
> > >> > >> > >> > > > > > > > > from
> > >> > >> > >> > > > > > > > > > > it,
> > >> > >> > >> > > > > > > > > > > > > e.g.
> > >> > >> > >> > > > > > > > > > > > > > something like (total_time -
> > >> > >> > >> > > > request/response_queue_time
> > >> > >> > >> > > > > -
> > >> > >> > >> > > > > > > > > > > > remote_time).
> > >> > >> > >> > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > I agree with Guozhang that when a
> > >> user is
> > >> > >> > >> throttled
> > >> > >> > >> > > it
> > >> > >> > >> > > > is
> > >> > >> > >> > > > > > > > likely
> > >> > >> > >> > > > > > > > > > that
> > >> > >> > >> > > > > > > > > > > > we
> > >> > >> > >> > > > > > > > > > > > > > need to see if anything has went
> > wrong
> > >> > >> first,
> > >> > >> > >> and
> > >> > >> > >> > if
> > >> > >> > >> > > > the
> > >> > >> > >> > > > > > > users
> > >> > >> > >> > > > > > > > > are
> > >> > >> > >> > > > > > > > > > > well
> > >> > >> > >> > > > > > > > > > > > > > behaving and just need more
> > >> resources, we
> > >> > >> will
> > >> > >> > >> have
> > >> > >> > >> > > to
> > >> > >> > >> > > > > bump
> > >> > >> > >> > > > > > > up
> > >> > >> > >> > > > > > > > > the
> > >> > >> > >> > > > > > > > > > > > quota
> > >> > >> > >> > > > > > > > > > > > > > for them. It is true that
> > >> pre-allocating
> > >> > >> CPU
> > >> > >> > >> time
> > >> > >> > >> > > quota
> > >> > >> > >> > > > > > > > precisely
> > >> > >> > >> > > > > > > > > > for
> > >> > >> > >> > > > > > > > > > > > the
> > >> > >> > >> > > > > > > > > > > > > > users is difficult. So in practice
> > it
> > >> > would
> > >> > >> > >> > probably
> > >> > >> > >> > > be
> > >> > >> > >> > > > > > more
> > >> > >> > >> > > > > > > > like
> > >> > >> > >> > > > > > > > > > > first
> > >> > >> > >> > > > > > > > > > > > > set
> > >> > >> > >> > > > > > > > > > > > > > a relative high protective CPU
> time
> > >> quota
> > >> > >> for
> > >> > >> > >> > > everyone
> > >> > >> > >> > > > > and
> > >> > >> > >> > > > > > > > > increase
> > >> > >> > >> > > > > > > > > > > > that
> > >> > >> > >> > > > > > > > > > > > > > for some individual clients on
> > demand.
> > >> > >> > >> > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > Thanks,
> > >> > >> > >> > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > >> > >> > >> > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM,
> > >> Guozhang
> > >> > >> > Wang <
> > >> > >> > >> > > > > > > > > wangguoz@gmail.com
> > >> > >> > >> > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > wrote:
> > >> > >> > >> > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > > This is a great proposal, glad
> to
> > >> see
> > >> > it
> > >> > >> > >> > happening.
> > >> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > > I am inclined to the CPU
> > >> throttling, or
> > >> > >> more
> > >> > >> > >> > > > > specifically
> > >> > >> > >> > > > > > > > > > > processing
> > >> > >> > >> > > > > > > > > > > > > time
> > >> > >> > >> > > > > > > > > > > > > > > ratio instead of the request
> rate
> > >> > >> throttling
> > >> > >> > >> as
> > >> > >> > >> > > well.
> > >> > >> > >> > > > > > > Becket
> > >> > >> > >> > > > > > > > > has
> > >> > >> > >> > > > > > > > > > > very
> > >> > >> > >> > > > > > > > > > > > > > well
> > >> > >> > >> > > > > > > > > > > > > > > summed my rationales above, and
> > one
> > >> > >> thing to
> > >> > >> > >> add
> > >> > >> > >> > > here
> > >> > >> > >> > > > > is
> > >> > >> > >> > > > > > > that
> > >> > >> > >> > > > > > > > > the
> > >> > >> > >> > > > > > > > > > > > > former
> > >> > >> > >> > > > > > > > > > > > > > > has a good support for both
> > >> "protecting
> > >> > >> > >> against
> > >> > >> > >> > > rogue
> > >> > >> > >> > > > > > > > clients"
> > >> > >> > >> > > > > > > > > as
> > >> > >> > >> > > > > > > > > > > > well
> > >> > >> > >> > > > > > > > > > > > > as
> > >> > >> > >> > > > > > > > > > > > > > > "utilizing a cluster for
> > >> multi-tenancy
> > >> > >> > usage":
> > >> > >> > >> > when
> > >> > >> > >> > > > > > > thinking
> > >> > >> > >> > > > > > > > > > about
> > >> > >> > >> > > > > > > > > > > > how
> > >> > >> > >> > > > > > > > > > > > > to
> > >> > >> > >> > > > > > > > > > > > > > > explain this to the end users, I
> > >> find
> > >> > it
> > >> > >> > >> actually
> > >> > >> > >> > > > more
> > >> > >> > >> > > > > > > > natural
> > >> > >> > >> > > > > > > > > > than
> > >> > >> > >> > > > > > > > > > > > the
> > >> > >> > >> > > > > > > > > > > > > > > request rate since as mentioned
> > >> above,
> > >> > >> > >> different
> > >> > >> > >> > > > > requests
> > >> > >> > >> > > > > > > > will
> > >> > >> > >> > > > > > > > > > have
> > >> > >> > >> > > > > > > > > > > > > quite
> > >> > >> > >> > > > > > > > > > > > > > > different "cost", and Kafka
> today
> > >> > already
> > >> > >> > have
> > >> > >> > >> > > > various
> > >> > >> > >> > > > > > > > request
> > >> > >> > >> > > > > > > > > > > types
> > >> > >> > >> > > > > > > > > > > > > > > (produce, fetch, admin,
> metadata,
> > >> etc),
> > >> > >> > >> because
> > >> > >> > >> > of
> > >> > >> > >> > > > that
> > >> > >> > >> > > > > > the
> > >> > >> > >> > > > > > > > > > request
> > >> > >> > >> > > > > > > > > > > > > rate
> > >> > >> > >> > > > > > > > > > > > > > > throttling may not be as
> effective
> > >> > >> unless it
> > >> > >> > >> is
> > >> > >> > >> > set
> > >> > >> > >> > > > > very
> > >> > >> > >> > > > > > > > > > > > > conservatively.
> > >> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > >> > > > > > > > > > > > > > > Regarding to user reactions when
> > >> they
> > >> > are
> > >> > >> > >> > > throttled,
> > >> > >> > >> > > > I
> > >> > >> > >> > > > > > > think
> > >> > >> > >> > > > > > > > it
> > >> > >> > >> > > > > > > > > > may
> > >> > >> > >> > > > > > > > > > > > > > differ
> > >> > >> > >> > > > > > > > > > > > > > > case-by-case, and need to be
> > >> > discovered /
> > >> > >> > >> guided
> > >> > >> > >> > by
> > >> > >> > >> > > > > > looking
> > >> > >> > >> > > > > > > > at
> > >> > >> > >> > > > > > > > > > > > relative
> > >> > >> > >> > > > > > > > > > > > > > > metrics. So in other words users
> > >> would
> > >> > >> not
> > >> > >> > >> expect
> > >> > >> > >> > > to
> > >> > >> > >> > > > > get
> > >> > >> > >> > > > > > > > > > additional
> > >> > >> > >> > > > > > > > > > > > > > > information by simply being told
> > >> "hey,
> > >> > >> you
> > >> > >> > are
> > >> > >> > >> > > > > > throttled",
> > >> > >> > >> > > > > > > > > which
> > >> > >> > >> > > > > > > > > > is
> > >> > >> > >> > > > > > > > > > > > all
> > >> > >> > >> > > > > > > > > > > > > > > what throttling does; they need
> to
> > >> > take a
> > >> > >> > >> > follow-up
> > >> > >> > >> > > > > step
> > >> > >> > >> > > > > > > and
> > >> > >> > >> > > > > > > > > see
> > >> > >> > >> > > > > > > > > > > > "hmm,
> > >> > >> > >> > > > > > > > > > > > > > I'm
> > >> > >> > >> > > > > > > > > > > > > > > throttled probably because of
> ..",
> > >> > which
> > >> > >> is
> > >> > >> > by
> > >> > >> > >> > > > looking
> > >> > >> > >> > > > > at
> > >> > >> > >> > > > > > > > other
> > >> > >> > >> > > > > > > > > > > > metric
> > >> > >> > >> > > > > > > > > > > > > > > values: e.g. whether I'm
> > bombarding
> > >> the
> > >> > >> > >> brokers
> > >> > >> > >> > > with
> > >> > >> > >> > > > >
> > >>
> > > ...
> > >
> > > [Message clipped]
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Jun,

If we use request.percentage as the percentage used in a single I/O thread,
the total percentage being allocated will be num.io.threads * 100 for I/O
threads and num.network.threads * 100 for network threads. A single quota
covering the two as a percentage wouldn't quite work if you want to
allocate the same proportion in both cases. If we want to treat threads as
separate units, won't we need two quota configurations regardless of
whether we use units or percentage? Perhaps I misunderstood your suggestion.

I think there are two cases:

   1. The use case that you mentioned where an admin is adding more users
   and decides to add more I/O threads and expects to find free quota to
   allocate for new users.
   2. Admin adds more I/O threads because the I/O threads are saturated and
   there are cores available to allocate, even though the number of
   users/clients hasn't changed.

If we treated I/O threads as a single unit of 100%, all user quotas need
to be reallocated for 1). If we allocated I/O threads as n units with
n*100%, all user quotas need to be reallocated for 2), otherwise some of
the new threads may just not be used. Either way it should be easy to
write a script to decrease/increase quotas by a multiple for all users.
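
A minimal sketch of what such a rescaling script could look like (the user
list, the current quota values and the kafka-configs.sh flags below are
illustrative assumptions, not part of the KIP):

    import java.util.Map;

    public class RescaleQuotas {
        public static void main(String[] args) {
            // Factor by which thread capacity changed, e.g. 2.0 after doubling
            double factor = args.length > 0 ? Double.parseDouble(args[0]) : 2.0;
            // Hypothetical current allocations: user -> quota units
            Map<String, Double> quotas = Map.of("userA", 0.10, "userB", 0.05);
            quotas.forEach((user, units) -> System.out.printf(
                "kafka-configs.sh --alter --entity-type users --entity-name %s "
                    + "--add-config 'io_thread_units=%.3f'%n",
                user, units * factor));
        }
    }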

So it really boils down to which quota unit is most intuitive in terms of
configuration. And from the discussion so far, it feels like opinion is
divided on whether quotas should be carved out of an absolute 100% (or 1
unit) or be relative to the number of threads (n*100% or n units).



On Thu, Mar 2, 2017 at 7:31 PM, Jun Rao <ju...@confluent.io> wrote:

> Another way to express an absolute limit is to use request.percentage, but
> treat it as the percentage used in a single request handling thread. For
> now, the request handling threads can be just the io threads. In the
> future, they can cover the network threads as well. This is similar to how
> top reports CPU usage and may be a bit easier for people to understand.
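>
> (To make that concrete with made-up numbers: if one user's requests
> consumed 150ms of request handling time in a 1s window, then expressed
> against a single thread that is 100 * 150 / 1000 = 15, i.e. 15% of one
> thread, just as top can report 150% for a process using 1.5 cores. A
> limit of 50 would then mean "up to half of one request handler thread"
> regardless of the pool size.)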
>
> Thanks,
>
> Jun
>
> On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Jay,
> >
> > 2. Regarding request.unit vs request.percentage. I started with
> > request.percentage too. The reasoning for request.unit is the
> > following. Suppose that the capacity has been reached on a broker and
> > the admin needs to add a new user. A simple way to increase the
> > capacity is to increase the number of io threads, assuming there are
> > still enough cores. If the limit is based on percentage, the
> > additional capacity automatically gets distributed to existing users
> > and we haven't really carved out any additional resource for the new
> > user. Now, is it easy for a user to reason about 0.1 unit vs 10%? My
> > feeling is that both are hard and have to be configured empirically.
> > Not sure if percentage is obviously easier to reason about.
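> >
> > (A concrete illustration of the difference, with made-up numbers: with
> > 8 io threads, a 10% quota corresponds to 0.8 thread-units of capacity;
> > if the admin grows the pool to 16 threads, the same 10% silently
> > becomes 1.6 units, whereas an absolute grant of 0.8 units is unchanged
> > and the added capacity remains free to carve out for the new user.)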
> >
> > Thanks,
> >
> > Jun
> >
> > On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:
> >
> >> A couple of quick points:
> >>
> >> 1. Even though the implementation of this quota is only using io
> >> thread time, I think we should call it something like "request-time".
> >> This will give us flexibility to improve the implementation to cover
> >> network threads in the future and will avoid exposing internal
> >> details like our thread pools on the server.
> >>
> >> 2. Jun/Roger, I get what you are trying to fix but the idea of
> >> thread/units is super unintuitive as a user-facing knob. I had to
> >> read the KIP like eight times to understand this. I'm not sure about
> >> your point that increasing the number of threads is a problem with a
> >> percentage-based value; it really depends on whether the user thinks
> >> about the "percentage of request processing time" or "thread units".
> >> If they think "I have allocated 10% of my request processing time to
> >> user x" then it is a bug that increasing the thread count decreases
> >> that percent as it does in the current proposal. As a practical
> >> matter I think the only way to actually reason about this is as a
> >> percent---I just don't believe people are going to think, "ah, 4.3
> >> thread units, that is the right amount!". Instead I think they have
> >> to understand this thread unit concept, figure out what they have set
> >> in number of threads, compute a percent and then come up with the
> >> number of thread units, and these will all be wrong if that thread
> >> count changes. I also think this ties us to throttling the I/O thread
> >> pool, which may not be where we want to end up.
> >>
> >> 3. For what it's worth I do think having a single throttle_ms field
> >> in all the responses that combines all throttling from all quotas is
> >> probably the simplest. There could be a use case for having separate
> >> fields for each, but I think that is actually harder to use/monitor
> >> in the common case, so unless someone has a use case I think just one
> >> should be fine.
> >>
> >> -Jay
> >>
> >> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram
> >> <rajinisivaram@gmail.com> wrote:
> >>
> >> > I have updated the KIP based on the discussions so far.
> >> >
> >> >
> >> > Regards,
> >> >
> >> > Rajini
> >> >
> >> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram
> >> > <rajinisivaram@gmail.com> wrote:
> >> >
> >> > > Thank you all for the feedback.
> >> > >
> >> > > Ismael #1. It makes sense not to throttle inter-broker requests
> >> > > like LeaderAndIsr etc. The simplest way to ensure that clients
> >> > > cannot use these requests to bypass quotas for DoS attacks is to
> >> > > ensure that ACLs prevent clients from using these requests and
> >> > > that unauthorized requests are included towards quotas.
> >> > >
> >> > > Ismael #2, Jay #1: I was thinking that these quotas can return a
> >> > > separate throttle time, and all utilization based quotas could
> >> > > use the same field (we won't add another one for network thread
> >> > > utilization, for instance). But perhaps it makes sense to keep
> >> > > byte rate quotas separate in produce/fetch responses to provide
> >> > > separate metrics? Agree with Ismael that the name of the existing
> >> > > field should be changed if we have two. Happy to switch to a
> >> > > single combined throttle time if that is sufficient.
> >> > >
> >> > > Ismael #4, #5, #6: Will update KIP. Will use a dot separated name
> >> > > for the new property. Replication quotas use dot separated, so it
> >> > > will be consistent with all properties except byte rate quotas.
> >> > >
> >> > > Radai: #1 Request processing time rather than request rate was
> >> > > chosen because the time per request can vary significantly
> >> > > between requests, as mentioned in the discussion and the KIP.
> >> > > #2 Two separate quotas for heartbeats/regular requests feel like
> >> > > more configuration and more metrics. Since most users would set
> >> > > quotas higher than the expected usage and quotas are more of a
> >> > > safety net, a single quota should work in most cases.
> >> > > #3 The number of requests in purgatory is limited by the number
> >> > > of active connections since only one request per connection will
> >> > > be throttled at a time.
> >> > > #4 As with byte rate quotas, to use the full allocated quotas,
> >> > > clients/users would need to use partitions that are distributed
> >> > > across the cluster. The alternative of using cluster-wide quotas
> >> > > instead of per-broker quotas would be far too complex to
> >> > > implement.
> >> > >
> >> > > Dong: We currently have two ClientQuotaManagers for the quota
> >> > > types Fetch and Produce. A new one will be added for IOThread,
> >> > > which manages quotas for I/O thread utilization. This will not
> >> > > update the Fetch or Produce queue-size, but will have a separate
> >> > > metric for the queue-size. I wasn't planning to add any
> >> > > additional metrics apart from the equivalent ones for existing
> >> > > quotas as part of this KIP. Ratio of byte-rate to I/O thread
> >> > > utilization could be slightly misleading since it depends on the
> >> > > sequence of requests. But we can look into more metrics after the
> >> > > KIP is implemented if required.
> >> > >
> >> > > I think we need to limit the maximum delay since all requests are
> >> > > throttled. If a client has a quota of 0.001 units and a single
> >> > > request used 50ms, we don't want to delay all requests from the
> >> > > client by 50 seconds, throwing the client out of all its consumer
> >> > > groups. The issue arises only if a user is allocated a quota that
> >> > > is insufficient to process one large request. The expectation is
> >> > > that the units allocated per user will be much higher than the
> >> > > time taken to process one request and the limit should seldom be
> >> > > applied. Agree this needs proper documentation.
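> >> > >
> >> > > (A sketch of that arithmetic in Java, using the numbers above;
> >> > > the variable names are illustrative only:
> >> > >
> >> > >     double quotaUnits = 0.001;  // allocated quota
> >> > >     long usedMs = 50;           // I/O thread time of one request
> >> > >     long windowMs = 1000;       // quota window
> >> > >     long rawDelayMs = (long) (usedMs / quotaUnits);  // 50,000ms
> >> > >     long delayMs = Math.min(rawDelayMs, windowMs);   // capped: 1s
> >> > >
> >> > > Without the cap, the 50 second delay would exceed typical session
> >> > > timeouts and drop the client from its groups.)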
> >> > >
> >> > > Regards,
> >> > >
> >> > > Rajini
> >> > >
> >> > >
> >> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com>
> >> > > wrote:
> >> > >
> >> > >> @jun: i wasnt concerned about tying up a request processing
> >> > >> thread, but IIUC the code does still read the entire request
> >> > >> out, which might add up to a non-negligible amount of memory.
> >> > >>
> >> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com>
> >> > >> wrote:
> >> > >>
> >> > >> > Hey Rajini,
> >> > >> >
> >> > >> > The current KIP says that the maximum delay will be reduced to
> >> > >> > the window size if it is larger than the window size. I have a
> >> > >> > concern with this:
> >> > >> >
> >> > >> > 1) This essentially means that the user is allowed to exceed
> >> > >> > their quota over a long period of time. Can you provide an
> >> > >> > upper bound on this deviation?
> >> > >> >
> >> > >> > 2) What is the motivation for capping the maximum delay by the
> >> > >> > window size? I am wondering if there is a better alternative
> >> > >> > to address the problem.
> >> > >> >
> >> > >> > 3) It means that the existing metric-related config will have
> >> > >> > a more direct impact on the mechanism of this
> >> > >> > io-thread-unit-based quota. This may be an important change
> >> > >> > depending on the answer to 1) above. We probably need to
> >> > >> > document this more explicitly.
> >> > >> >
> >> > >> > Dong
> >> > >> >
> >> > >> >
> >> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com>
> >> > >> > wrote:
> >> > >> >
> >> > >> > > Hey Jun,
> >> > >> > >
> >> > >> > > Yeah you are right. I thought it wasn't because at LinkedIn
> >> > >> > > it would be too much pressure on inGraph to expose those
> >> > >> > > per-clientId metrics, so we ended up printing them
> >> > >> > > periodically to a local log. Never mind if it is not a
> >> > >> > > general problem.
> >> > >> > >
> >> > >> > > Hey Rajini,
> >> > >> > >
> >> > >> > > - I agree with Jay that we probably don't want to add a new
> >> > >> > > field for every quota in ProduceResponse or FetchResponse.
> >> > >> > > Is there any use-case for having separate throttle-time
> >> > >> > > fields for byte-rate-quota and io-thread-unit-quota? You
> >> > >> > > probably need to document this as an interface change if you
> >> > >> > > plan to add a new field in any request.
> >> > >> > >
> >> > >> > > - I don't think IOThread belongs to quotaType. The existing
> >> > >> > > quota types (i.e.
> >> > >> > > Produce/Fetch/LeaderReplication/FollowerReplication)
> >> > >> > > identify the type of request that is throttled, not the
> >> > >> > > quota mechanism that is applied.
> >> > >> > >
> >> > >> > > - If a request is throttled due to this io-thread-unit-based
> >> > >> > > quota, is the existing queue-size metric in
> >> > >> > > ClientQuotaManager incremented?
> >> > >> > >
> >> > >> > > - In the interest of providing a guideline for admins to
> >> > >> > > decide the io-thread-unit-based quota and for users to
> >> > >> > > understand its impact on their traffic, would it be useful
> >> > >> > > to have a metric that shows the overall byte-rate per
> >> > >> > > io-thread-unit? Can we also show this as a per-clientId
> >> > >> > > metric?
> >> > >> > >
> >> > >> > > Thanks,
> >> > >> > > Dong
> >> > >> > >
> >> > >> > >
> >> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io>
> >> > >> > > wrote:
> >> > >> > >
> >> > >> > >> Hi, Ismael,
> >> > >> > >>
> >> > >> > >> For #3, typically, an admin won't configure more io threads
> >> > >> > >> than CPU cores, but it's possible for an admin to start
> >> > >> > >> with fewer io threads than cores and grow that later on.
> >> > >> > >>
> >> > >> > >> Hi, Dong,
> >> > >> > >>
> >> > >> > >> I think the throttleTime sensor on the broker tells the
> >> > >> > >> admin whether a user/clientId is throttled or not.
> >> > >> > >>
> >> > >> > >> Hi, Radai,
> >> > >> > >>
> >> > >> > >> The reasoning for delaying the throttled requests on the
> >> > >> > >> broker instead of returning an error immediately is that
> >> > >> > >> the latter has no way to prevent the client from retrying
> >> > >> > >> immediately, which will make things worse. The delaying
> >> > >> > >> logic is based off a delay queue. A separate expiration
> >> > >> > >> thread just waits on the next request to be expired. So, it
> >> > >> > >> doesn't tie up a request handler thread.
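> >> > >> > >>
> >> > >> > >> (A minimal sketch of that pattern with
> >> > >> > >> java.util.concurrent.DelayQueue; the ThrottledResponse
> >> > >> > >> wrapper here is hypothetical, not the broker's actual
> >> > >> > >> class:
> >> > >> > >>
> >> > >> > >>     import java.util.concurrent.DelayQueue;
> >> > >> > >>     import java.util.concurrent.Delayed;
> >> > >> > >>     import java.util.concurrent.TimeUnit;
> >> > >> > >>
> >> > >> > >>     public class ThrottleDemo {
> >> > >> > >>         // A response that may be sent once its delay expires
> >> > >> > >>         static class ThrottledResponse implements Delayed {
> >> > >> > >>             final long dueNs;
> >> > >> > >>             final Runnable send;
> >> > >> > >>             ThrottledResponse(long delayMs, Runnable send) {
> >> > >> > >>                 this.dueNs = System.nanoTime()
> >> > >> > >>                     + TimeUnit.MILLISECONDS.toNanos(delayMs);
> >> > >> > >>                 this.send = send;
> >> > >> > >>             }
> >> > >> > >>             public long getDelay(TimeUnit u) {
> >> > >> > >>                 return u.convert(dueNs - System.nanoTime(),
> >> > >> > >>                                  TimeUnit.NANOSECONDS);
> >> > >> > >>             }
> >> > >> > >>             public int compareTo(Delayed o) {
> >> > >> > >>                 return Long.compare(
> >> > >> > >>                     getDelay(TimeUnit.NANOSECONDS),
> >> > >> > >>                     o.getDelay(TimeUnit.NANOSECONDS));
> >> > >> > >>             }
> >> > >> > >>         }
> >> > >> > >>
> >> > >> > >>         public static void main(String[] args) {
> >> > >> > >>             DelayQueue<ThrottledResponse> q = new DelayQueue<>();
> >> > >> > >>             // One expiration thread blocks on take(), which
> >> > >> > >>             // returns only expired entries; no request
> >> > >> > >>             // handler thread is parked while a response waits.
> >> > >> > >>             new Thread(() -> {
> >> > >> > >>                 while (true) {
> >> > >> > >>                     try { q.take().send.run(); }
> >> > >> > >>                     catch (InterruptedException e) { return; }
> >> > >> > >>                 }
> >> > >> > >>             }, "quota-expiration").start();
> >> > >> > >>             q.add(new ThrottledResponse(100,
> >> > >> > >>                 () -> System.out.println("sent")));
> >> > >> > >>         }
> >> > >> > >>     }
> >> > >> > >> )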
> >> > >> > >>
> >> > >> > >> Thanks,
> >> > >> > >>
> >> > >> > >> Jun
> >> > >> > >>
> >> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma
> >> > >> > >> <ismael@juma.me.uk> wrote:
> >> > >> > >>
> >> > >> > >> > Hi Jay,
> >> > >> > >> >
> >> > >> > >> > Regarding 1, I definitely like the simplicity of keeping
> >> > >> > >> > a single throttle time field in the response. The
> >> > >> > >> > downside is that the client metrics will be more coarse
> >> > >> > >> > grained.
> >> > >> > >> >
> >> > >> > >> > Regarding 3, we have
> >> > >> > >> > `leader.imbalance.per.broker.percentage` and
> >> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> >> > >> > >> >
> >> > >> > >> > Ismael
> >> > >> > >> >
> >> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps
> >> > >> > >> > <jay@confluent.io> wrote:
> >> > >> > >> >
> >> > >> > >> > > A few minor comments:
> >> > >> > >> > >
> >> > >> > >> > >    1. Isn't it the case that the throttling time
> >> > >> > >> > >    response field should have the total time your
> >> > >> > >> > >    request was throttled irrespective of the quotas
> >> > >> > >> > >    that caused it? Limiting it to the byte rate quota
> >> > >> > >> > >    doesn't make sense, but I also don't think we want
> >> > >> > >> > >    to end up adding new fields in the response for
> >> > >> > >> > >    every single thing we quota, right?
> >> > >> > >> > >    2. I don't think we should make this quota
> >> > >> > >> > >    specifically about io threads. Once we introduce
> >> > >> > >> > >    these quotas people set them and expect them to be
> >> > >> > >> > >    enforced (and if they aren't it may cause an
> >> > >> > >> > >    outage). As a result they are a bit more sensitive
> >> > >> > >> > >    than normal configs, I think. The current thread
> >> > >> > >> > >    pools seem like something of an implementation
> >> > >> > >> > >    detail and not the level the user-facing quotas
> >> > >> > >> > >    should be involved with. I think it might be better
> >> > >> > >> > >    to make this a general request-time throttle with no
> >> > >> > >> > >    mention in the naming about I/O threads and simply
> >> > >> > >> > >    acknowledge the current limitation (which we may
> >> > >> > >> > >    someday fix) in the docs that this covers only the
> >> > >> > >> > >    time after the request is read off the network.
> >> > >> > >> > >    3. As such I think the right interface to the user
> >> > >> > >> > >    would be something like percent_request_time in
> >> > >> > >> > >    {0,...,100} or request_time_ratio in {0.0,...,1.0}
> >> > >> > >> > >    (I think "ratio" is the terminology we used if the
> >> > >> > >> > >    scale is between 0 and 1 in the other metrics,
> >> > >> > >> > >    right?)
> >> > >> > >> > >
> >> > >> > >> > > -Jay
> >> > >> > >> > >
> >> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram
> >> > >> > >> > > <rajinisivaram@gmail.com> wrote:
> >> > >> > >> > >
> >> > >> > >> > > > Guozhang/Dong,
> >> > >> > >> > > >
> >> > >> > >> > > > Thank you for the feedback.
> >> > >> > >> > > >
> >> > >> > >> > > > Guozhang: I have updated the section on co-existence
> >> > >> > >> > > > of byte rate and request time quotas.
> >> > >> > >> > > >
> >> > >> > >> > > > Dong: I hadn't added much detail to the metrics and
> >> > >> > >> > > > sensors since they are going to be very similar to
> >> > >> > >> > > > the existing metrics and sensors. To avoid confusion,
> >> > >> > >> > > > I have now added more detail. All metrics are in the
> >> > >> > >> > > > group "quotaType" and all sensors have names starting
> >> > >> > >> > > > with "quotaType" (where quotaType is
> >> > >> > >> > > > Produce/Fetch/LeaderReplication/FollowerReplication/
> >> > >> > >> > > > *IOThread*). So there will be no reuse of existing
> >> > >> > >> > > > metrics/sensors. The new ones for request processing
> >> > >> > >> > > > time based throttling will be completely independent
> >> > >> > >> > > > of existing metrics/sensors, but will be consistent
> >> > >> > >> > > > in format.
> >> > >> > >> > > >
> >> > >> > >> > > > The existing throttle_time_ms field in produce/fetch
> >> > >> > >> > > > responses will not be impacted by this KIP. That will
> >> > >> > >> > > > continue to return byte-rate based throttling times.
> >> > >> > >> > > > In addition, a new field request_throttle_time_ms
> >> > >> > >> > > > will be added to return request quota based
> >> > >> > >> > > > throttling times. These will be exposed as new
> >> > >> > >> > > > metrics on the client-side.
> >> > >> > >> > > >
> >> > >> > >> > > > Since all metrics and sensors are different for each
> >> > >> > >> > > > type of quota, I believe there are already sufficient
> >> > >> > >> > > > metrics to monitor throttling on both the client and
> >> > >> > >> > > > broker side for each type of throttling.
> >> > >> > >> > > >
> >> > >> > >> > > > Regards,
> >> > >> > >> > > >
> >> > >> > >> > > > Rajini
> >> > >> > >> > > >
> >> > >> > >> > > >
> >> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin
> >> > >> > >> > > > <lindong28@gmail.com> wrote:
> >> > >> > >> > > >
> >> > >> > >> > > > > Hey Rajini,
> >> > >> > >> > > > >
> >> > >> > >> > > > > I think it makes a lot of sense to use
> >> > >> > >> > > > > io_thread_units as the metric to quota users'
> >> > >> > >> > > > > traffic here. LGTM overall. I have some questions
> >> > >> > >> > > > > regarding sensors.
> >> > >> > >> > > > >
> >> > >> > >> > > > > - Can you be more specific in the KIP about which
> >> > >> > >> > > > > sensors will be added? For example, it will be
> >> > >> > >> > > > > useful to specify the name and attributes of these
> >> > >> > >> > > > > new sensors.
> >> > >> > >> > > > >
> >> > >> > >> > > > > - We currently have throttle-time and queue-size
> >> > >> > >> > > > > for the byte-rate based quota. Are you going to
> >> > >> > >> > > > > have a separate throttle-time and queue-size for
> >> > >> > >> > > > > requests throttled by the io_thread_unit-based
> >> > >> > >> > > > > quota, or will they share the same sensor?
> >> > >> > >> > > > >
> >> > >> > >> > > > > - Does the throttle-time in the ProduceResponse and
> >> > >> > >> > > > > FetchResponse contain time due to the
> >> > >> > >> > > > > io_thread_unit-based quota?
> >> > >> > >> > > > >
> >> > >> > >> > > > > - Currently the Kafka server does not provide any
> >> > >> > >> > > > > log or metrics that tell whether any given clientId
> >> > >> > >> > > > > (or user) is throttled. This is not too bad because
> >> > >> > >> > > > > we can still check the client-side byte-rate metric
> >> > >> > >> > > > > to validate whether a given client is throttled.
> >> > >> > >> > > > > But with this io_thread_unit there will be no way
> >> > >> > >> > > > > to validate whether a given client is slow because
> >> > >> > >> > > > > it has exceeded its io_thread_unit limit. It is
> >> > >> > >> > > > > necessary for users to be able to know this
> >> > >> > >> > > > > information to figure out whether they have reached
> >> > >> > >> > > > > their quota limit. How about we add a log4j log on
> >> > >> > >> > > > > the server side to periodically print the
> >> > >> > >> > > > > (client_id, byte-rate-throttle-time,
> >> > >> > >> > > > > io-thread-unit-throttle-time) so that the Kafka
> >> > >> > >> > > > > administrator can figure out which users have
> >> > >> > >> > > > > reached their limit and act accordingly?
> >> > >> > >> > > > >
> >> > >> > >> > > > > Thanks,
> >> > >> > >> > > > > Dong
> >> > >> > >> > > > >
> >> > >> > >> > > > >
> >> > >> > >> > > > >
> >> > >> > >> > > > >
> >> > >> > >> > > > >
> >> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang
> >> > >> > >> > > > > <wangguoz@gmail.com> wrote:
> >> > >> > >> > > > >
> >> > >> > >> > > > > > Made a pass over the doc, overall LGTM except a
> >> > >> > >> > > > > > minor comment on the throttling implementation:
> >> > >> > >> > > > > >
> >> > >> > >> > > > > > Stated as "Request processing time throttling
> >> > >> > >> > > > > > will be applied on top if necessary." I thought
> >> > >> > >> > > > > > that it meant the request processing time
> >> > >> > >> > > > > > throttling is applied first, but continuing to
> >> > >> > >> > > > > > read I found it actually meant to apply produce /
> >> > >> > >> > > > > > fetch byte rate throttling first.
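> >> > >> > >> > > > > >
> >> > >> > >> > > > > > (One plausible reading of that ordering, sketched
> >> > >> > >> > > > > > in Java with made-up names - this is an
> >> > >> > >> > > > > > illustration, not the KIP's actual code:
> >> > >> > >> > > > > >
> >> > >> > >> > > > > >     long byteDelayMs = byteRateQuota.delayMs(client);
> >> > >> > >> > > > > >     long timeDelayMs = requestTimeQuota.delayMs(client);
> >> > >> > >> > > > > >     // byte rate throttling is applied first; if the
> >> > >> > >> > > > > >     // request time quota demands a longer delay, the
> >> > >> > >> > > > > >     // remainder is the delay "applied on top"
> >> > >> > >> > > > > >     long totalDelayMs = Math.max(byteDelayMs, timeDelayMs);
> >> > >> > >> > > > > > )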
> >> > >> > >> > > > > >
> >> > >> > >> > > > > > Also the last sentence, "The remaining delay if
> >> > >> > >> > > > > > any is applied to the response.", is a bit
> >> > >> > >> > > > > > confusing to me. Maybe reword it a bit?
> >> > >> > >> > > > > >
> >> > >> > >> > > > > > Guozhang
> >> > >> > >> > > > > >
> >> > >> > >> > > > > >
> >> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao
> >> > >> > >> > > > > > <jun@confluent.io> wrote:
> >> > >> > >> > > > > >
> >> > >> > >> > > > > > > Hi, Rajini,
> >> > >> > >> > > > > > >
> >> > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal
> >> > >> > >> > > > > > > looks good to me.
> >> > >> > >> > > > > > >
> >> > >> > >> > > > > > > Jun
> >> > >> > >> > > > > > >
> >> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram
> >> > >> > >> > > > > > > <rajinisivaram@gmail.com> wrote:
> >> > >> > >> > > > > > >
> >> > >> > >> > > > > > > > Jun/Roger,
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > Thank you for the feedback.
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > 1. I have updated the KIP to use absolute
> >> > >> > >> > > > > > > > units instead of percentage. The property is
> >> > >> > >> > > > > > > > called *io_thread_units* to align with the
> >> > >> > >> > > > > > > > thread count property *num.io.threads*. When
> >> > >> > >> > > > > > > > we implement network thread utilization
> >> > >> > >> > > > > > > > quotas, we can add another property
> >> > >> > >> > > > > > > > *network_thread_units*.
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > 2. ControlledShutdown is already listed under
> >> > >> > >> > > > > > > > the exempt requests. Jun, did you mean a
> >> > >> > >> > > > > > > > different request that needs to be added? The
> >> > >> > >> > > > > > > > four requests currently exempt in the KIP are
> >> > >> > >> > > > > > > > StopReplica, ControlledShutdown, LeaderAndIsr
> >> > >> > >> > > > > > > > and UpdateMetadata. These are controlled
> >> > >> > >> > > > > > > > using the ClusterAction ACL, so it is easy to
> >> > >> > >> > > > > > > > exclude them and only throttle if
> >> > >> > >> > > > > > > > unauthorized. I wasn't sure if there are
> >> > >> > >> > > > > > > > other requests used only for inter-broker
> >> > >> > >> > > > > > > > communication that needed to be excluded.
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > 3. I was thinking the smallest change would
> >> > >> > >> > > > > > > > be to replace all references to
> >> > >> > >> > > > > > > > *requestChannel.sendResponse()* with a local
> >> > >> > >> > > > > > > > method *sendResponseMaybeThrottle()* that
> >> > >> > >> > > > > > > > does the throttling if any, plus sends the
> >> > >> > >> > > > > > > > response (a sketch follows below). If we
> >> > >> > >> > > > > > > > throttle first in *KafkaApis.handle()*, the
> >> > >> > >> > > > > > > > time spent within the method handling the
> >> > >> > >> > > > > > > > request will not be recorded or used in
> >> > >> > >> > > > > > > > throttling. We can look into this again when
> >> > >> > >> > > > > > > > the PR is ready for review.
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > Regards,
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > Rajini
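> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > (A self-contained sketch of the shape of
> >> > >> > >> > > > > > > > *sendResponseMaybeThrottle()* from point 3
> >> > >> > >> > > > > > > > above. The quota callback and the sleep are
> >> > >> > >> > > > > > > > stand-ins - the broker would use a delay
> >> > >> > >> > > > > > > > queue rather than sleeping, and the real code
> >> > >> > >> > > > > > > > is Scala:
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > >     import java.util.function.LongUnaryOperator;
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > >     public class MaybeThrottle {
> >> > >> > >> > > > > > > >         // Hypothetical quota check: delay owed
> >> > >> > >> > > > > > > >         // (ms) for the recorded I/O thread time
> >> > >> > >> > > > > > > >         static LongUnaryOperator delayForIoMs =
> >> > >> > >> > > > > > > >             ioMs -> Math.max(0, ioMs - 10);
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > >         static void sendResponseMaybeThrottle(
> >> > >> > >> > > > > > > >                 long ioTimeMs, Runnable send)
> >> > >> > >> > > > > > > >                 throws InterruptedException {
> >> > >> > >> > > > > > > >             long delay =
> >> > >> > >> > > > > > > >                 delayForIoMs.applyAsLong(ioTimeMs);
> >> > >> > >> > > > > > > >             if (delay > 0) Thread.sleep(delay);
> >> > >> > >> > > > > > > >             send.run();  // send the response
> >> > >> > >> > > > > > > >         }
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > >         public static void main(String[] args)
> >> > >> > >> > > > > > > >                 throws InterruptedException {
> >> > >> > >> > > > > > > >             sendResponseMaybeThrottle(25,
> >> > >> > >> > > > > > > >                 () -> System.out.println("sent"));
> >> > >> > >> > > > > > > >         }
> >> > >> > >> > > > > > > >     }
> >> > >> > >> > > > > > > > )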
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover
> >> > >> > >> > > > > > > > <roger.hoover@gmail.com> wrote:
> >> > >> > >> > > > > > > >
> >> > >> > >> > > > > > > > > Great to see this KIP and the excellent
> >> > >> > >> > > > > > > > > discussion.
> >> > >> > >> > > > > > > > >
> >> > >> > >> > > > > > > > > To me, Jun's suggestion makes sense. If my
> >> > >> > >> > > > > > > > > application is allocated 1 request handler
> >> > >> > >> > > > > > > > > unit, then it's as if I have a Kafka broker
> >> > >> > >> > > > > > > > > with a single request handler thread
> >> > >> > >> > > > > > > > > dedicated to me. That's the most I can use,
> >> > >> > >> > > > > > > > > at least. That allocation doesn't change
> >> > >> > >> > > > > > > > > even if an admin later increases the size
> >> > >> > >> > > > > > > > > of the request thread pool on the broker.
> >> > >> > >> > > > > > > > > It's similar to the CPU abstraction that
> >> > >> > >> > > > > > > > > VMs and containers get from hypervisors or
> >> > >> > >> > > > > > > > > OS schedulers. While different client
> >> > >> > >> > > > > > > > > access patterns can use wildly different
> >> > >> > >> > > > > > > > > amounts of request thread resources per
> >> > >> > >> > > > > > > > > request, a given application will generally
> >> > >> > >> > > > > > > > > have a stable access pattern and can figure
> >> > >> > >> > > > > > > > > out empirically how many "request thread
> >> > >> > >> > > > > > > > > units" it needs to meet its
> >> > >> > >> > > > > > > > > throughput/latency goals.
> >> > >> > >> > > > > > > > >
> >> > >> > >> > > > > > > > > Cheers,
> >> > >> > >> > > > > > > > >
> >> > >> > >> > > > > > > > > Roger
> >> > >> > >> > > > > > > > >
> >> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao
> >> > >> > >> > > > > > > > > <jun@confluent.io> wrote:
> >> > >> > >> > > > > > > > >
> >> > >> > >> > > > > > > > > > Hi, Rajini,
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > Thanks for the updated KIP. A few more
> >> > >> > >> > > > > > > > > > comments.
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > 1. A concern with request_time_percent is
> >> > >> > >> > > > > > > > > > that it's not an absolute value. Let's
> >> > >> > >> > > > > > > > > > say you give a user a 10% limit. If the
> >> > >> > >> > > > > > > > > > admin doubles the number of request
> >> > >> > >> > > > > > > > > > handler threads, that user now actually
> >> > >> > >> > > > > > > > > > has twice the absolute capacity. This may
> >> > >> > >> > > > > > > > > > confuse people a bit. So, perhaps setting
> >> > >> > >> > > > > > > > > > the quota based on an absolute request
> >> > >> > >> > > > > > > > > > thread unit is better.
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
> >> > >> > >> > > > > > > > > > inter-broker request and needs to be
> >> > >> > >> > > > > > > > > > excluded from throttling.
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > 3. Implementation wise, I am wondering if
> >> > >> > >> > > > > > > > > > it's simpler to apply the request time
> >> > >> > >> > > > > > > > > > throttling first in KafkaApis.handle().
> >> > >> > >> > > > > > > > > > Otherwise, we will need to add the
> >> > >> > >> > > > > > > > > > throttling logic in each type of request.
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > Thanks,
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > Jun
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini
> >> > >> > >> > > > > > > > > > Sivaram <rajinisivaram@gmail.com> wrote:
> >> > >> > >> > > > > > > > > >
> >> > >> > >> > > > > > > > > > > Jun,
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > Thank you for the review.
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > I have reverted to the original KIP
> >> > >> > >> > > > > > > > > > > that throttles based on request handler
> >> > >> > >> > > > > > > > > > > utilization. At the moment, it uses a
> >> > >> > >> > > > > > > > > > > percentage, but I am happy to change to
> >> > >> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if
> >> > >> > >> > > > > > > > > > > required. I have added the examples
> >> > >> > >> > > > > > > > > > > from this discussion to the KIP. Also
> >> > >> > >> > > > > > > > > > > added a "Future Work" section to
> >> > >> > >> > > > > > > > > > > address network thread utilization. The
> >> > >> > >> > > > > > > > > > > configuration is named
> >> > >> > >> > > > > > > > > > > "request_time_percent" with the
> >> > >> > >> > > > > > > > > > > expectation that it can also be used as
> >> > >> > >> > > > > > > > > > > the limit for network thread
> >> > >> > >> > > > > > > > > > > utilization when that is implemented,
> >> > >> > >> > > > > > > > > > > so that users have to set only one
> >> > >> > >> > > > > > > > > > > config for the two and not have to
> >> > >> > >> > > > > > > > > > > worry about the internal distribution
> >> > >> > >> > > > > > > > > > > of the work between the two thread
> >> > >> > >> > > > > > > > > > > pools in Kafka.
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > Regards,
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > Rajini
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun
> >> > >> > >> > > > > > > > > > > Rao <jun@confluent.io> wrote:
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > Hi, Rajini,
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > Thanks for the proposal.
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > The benefit of using the request
> >> > >> > >> > > > > > > > > > > > processing time over the request rate
> >> > >> > >> > > > > > > > > > > > is exactly what people have said. I
> >> > >> > >> > > > > > > > > > > > will just expand on that a bit.
> >> > >> > >> > > > > > > > > > > > Consider the following case. The
> >> > >> > >> > > > > > > > > > > > producer sends a produce request with
> >> > >> > >> > > > > > > > > > > > a 10MB message but compressed to
> >> > >> > >> > > > > > > > > > > > 100KB with gzip. The decompression of
> >> > >> > >> > > > > > > > > > > > the message on the broker could take
> >> > >> > >> > > > > > > > > > > > 10-15 seconds, during which time a
> >> > >> > >> > > > > > > > > > > > request handler thread is completely
> >> > >> > >> > > > > > > > > > > > blocked. In this case, neither the
> >> > >> > >> > > > > > > > > > > > byte-in quota nor the request rate
> >> > >> > >> > > > > > > > > > > > quota may be effective in protecting
> >> > >> > >> > > > > > > > > > > > the broker. Consider another case. A
> >> > >> > >> > > > > > > > > > > > consumer group starts with 10
> >> > >> > >> > > > > > > > > > > > instances and later on switches to 20
> >> > >> > >> > > > > > > > > > > > instances. The request rate will
> >> > >> > >> > > > > > > > > > > > likely double, but the actual load on
> >> > >> > >> > > > > > > > > > > > the broker may not double since each
> >> > >> > >> > > > > > > > > > > > fetch request only contains half of
> >> > >> > >> > > > > > > > > > > > the partitions. A request rate quota
> >> > >> > >> > > > > > > > > > > > may not be easy to configure in this
> >> > >> > >> > > > > > > > > > > > case.
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > What we really want is to be able to
> >> > >> > >> > > > > > > > > > > > prevent a client from using too much
> >> > >> > >> > > > > > > > > > > > of the server side resources. In this
> >> > >> > >> > > > > > > > > > > > particular KIP, this resource is the
> >> > >> > >> > > > > > > > > > > > capacity of the request handler
> >> > >> > >> > > > > > > > > > > > threads. I agree that it may not be
> >> > >> > >> > > > > > > > > > > > intuitive for the users to determine
> >> > >> > >> > > > > > > > > > > > how to set the right limit. However,
> >> > >> > >> > > > > > > > > > > > this is not completely new and has
> >> > >> > >> > > > > > > > > > > > been done in the container world
> >> > >> > >> > > > > > > > > > > > already. For example, Linux cgroup (
> >> > >> > >> > > > > > > > > > > > https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html
> >> > >> > >> > > > > > > > > > > > ) has the concept of
> >> > >> > >> > > > > > > > > > > > cpu.cfs_quota_us, which specifies the
> >> > >> > >> > > > > > > > > > > > total amount of time in microseconds
> >> > >> > >> > > > > > > > > > > > for which all tasks in a cgroup can
> >> > >> > >> > > > > > > > > > > > run during a one second period. We
> >> > >> > >> > > > > > > > > > > > can potentially model the request
> >> > >> > >> > > > > > > > > > > > handler threads in a similar way. For
> >> > >> > >> > > > > > > > > > > > example, each request handler thread
> >> > >> > >> > > > > > > > > > > > can be 1 request handler unit and the
> >> > >> > >> > > > > > > > > > > > admin can configure a limit on how
> >> > >> > >> > > > > > > > > > > > many units (say 0.01) a client can
> >> > >> > >> > > > > > > > > > > > have.
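> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > (The accounting, sketched with
> >> > >> > >> > > > > > > > > > > > made-up numbers: if a client's
> >> > >> > >> > > > > > > > > > > > requests used 8ms of handler time in
> >> > >> > >> > > > > > > > > > > > a 1s window, it consumed 8 / 1000 =
> >> > >> > >> > > > > > > > > > > > 0.008 units, within a 0.01 limit; at
> >> > >> > >> > > > > > > > > > > > 12ms it would be 0.012 units and
> >> > >> > >> > > > > > > > > > > > throttled.)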
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > Regarding not throttling the internal
> >> > >> > >> > > > > > > > > > > > broker to broker requests: we could
> >> > >> > >> > > > > > > > > > > > do that. Alternatively, we could just
> >> > >> > >> > > > > > > > > > > > let the admin configure a high limit
> >> > >> > >> > > > > > > > > > > > for the kafka user (it may not be
> >> > >> > >> > > > > > > > > > > > able to do that easily based on
> >> > >> > >> > > > > > > > > > > > clientId though).
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > Ideally we want to be able to protect
> >> > >> > >> > > > > > > > > > > > the utilization of the network thread
> >> > >> > >> > > > > > > > > > > > pool too. The difficulty is mostly
> >> > >> > >> > > > > > > > > > > > what Rajini said: (1) The mechanism
> >> > >> > >> > > > > > > > > > > > for throttling the requests is
> >> > >> > >> > > > > > > > > > > > through Purgatory and we will have to
> >> > >> > >> > > > > > > > > > > > think through how to integrate that
> >> > >> > >> > > > > > > > > > > > into the network layer. (2) In the
> >> > >> > >> > > > > > > > > > > > network layer, currently we know the
> >> > >> > >> > > > > > > > > > > > user, but not the clientId of the
> >> > >> > >> > > > > > > > > > > > request. So, it's a bit tricky to
> >> > >> > >> > > > > > > > > > > > throttle based on clientId there.
> >> > >> > >> > > > > > > > > > > > Plus, the byteOut quota can already
> >> > >> > >> > > > > > > > > > > > protect the network thread
> >> > >> > >> > > > > > > > > > > > utilization for fetch requests. So,
> >> > >> > >> > > > > > > > > > > > if we can't figure out this part
> >> > >> > >> > > > > > > > > > > > right now, just focusing on the
> >> > >> > >> > > > > > > > > > > > request handling threads for this KIP
> >> > >> > >> > > > > > > > > > > > is still a useful feature.
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > Thanks,
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > Jun
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM,
> >> > >> > >> > > > > > > > > > > > Rajini Sivaram
> >> > >> > >> > > > > > > > > > > > <rajinisivaram@gmail.com> wrote:
> >> > >> > >> > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > Thank you all for the feedback.
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > Jay: I have removed the exemption
> >> > >> > >> > > > > > > > > > > > > for consumer heartbeat etc. Agree
> >> > >> > >> > > > > > > > > > > > > that protecting the cluster is more
> >> > >> > >> > > > > > > > > > > > > important than protecting
> >> > >> > >> > > > > > > > > > > > > individual apps. Have retained the
> >> > >> > >> > > > > > > > > > > > > exemption for
> >> > >> > >> > > > > > > > > > > > > StopReplica/LeaderAndIsr etc; these
> >> > >> > >> > > > > > > > > > > > > are throttled only if authorization
> >> > >> > >> > > > > > > > > > > > > fails (so they can't be used for
> >> > >> > >> > > > > > > > > > > > > DoS attacks in a secure cluster,
> >> > >> > >> > > > > > > > > > > > > but this allows inter-broker
> >> > >> > >> > > > > > > > > > > > > requests to complete without
> >> > >> > >> > > > > > > > > > > > > delays).
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > I will wait another day to see if
> >> > >> > >> > > > > > > > > > > > > there is any objection to quotas
> >> > >> > >> > > > > > > > > > > > > based on request processing time
> >> > >> > >> > > > > > > > > > > > > (as opposed to request rate) and if
> >> > >> > >> > > > > > > > > > > > > there are no objections, I will
> >> > >> > >> > > > > > > > > > > > > revert to the original proposal
> >> > >> > >> > > > > > > > > > > > > with some changes.
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > The original proposal was only
> >> > >> > >> > > > > > > > > > > > > including the time used by the
> >> > >> > >> > > > > > > > > > > > > request handler threads (that made
> >> > >> > >> > > > > > > > > > > > > calculation easy). I think the
> >> > >> > >> > > > > > > > > > > > > suggestion is to include the time
> >> > >> > >> > > > > > > > > > > > > spent in the network threads as
> >> > >> > >> > > > > > > > > > > > > well since that may be significant.
> >> > >> > >> > > > > > > > > > > > > As Jay pointed out, it is more
> >> > >> > >> > > > > > > > > > > > > complicated to calculate the total
> >> > >> > >> > > > > > > > > > > > > available CPU time and convert to a
> >> > >> > >> > > > > > > > > > > > > ratio when there are *m* I/O
> >> > >> > >> > > > > > > > > > > > > threads and *n* network threads.
> >> > >> > >> > > > > > > > > > > > > ThreadMXBean#getThreadCPUTime() may
> >> > >> > >> > > > > > > > > > > > > give us what we want, but it can be
> >> > >> > >> > > > > > > > > > > > > very expensive on some platforms.
> >> > >> > >> > > > > > > > > > > > > As Becket and Guozhang have pointed
> >> > >> > >> > > > > > > > > > > > > out, we do have several time
> >> > >> > >> > > > > > > > > > > > > measurements already for generating
> >> > >> > >> > > > > > > > > > > > > metrics that we could use, though
> >> > >> > >> > > > > > > > > > > > > we might want to switch to
> >> > >> > >> > > > > > > > > > > > > nanoTime() instead of
> >> > >> > >> > > > > > > > > > > > > currentTimeMillis() since some of
> >> > >> > >> > > > > > > > > > > > > the values for small requests may
> >> > >> > >> > > > > > > > > > > > > be < 1ms. But rather than add up
> >> > >> > >> > > > > > > > > > > > > the time spent in I/O thread and
> >> > >> > >> > > > > > > > > > > > > network thread, wouldn't it be
> >> > >> > >> > > > > > > > > > > > > better to convert the time spent on
> >> > >> > >> > > > > > > > > > > > > each thread into a separate ratio?
> >> UserA
> >> > >> has
> >> > >> > a
> >> > >> > >> > > request
> >> > >> > >> > > > > > quota
> >> > >> > >> > > > > > > > of
> >> > >> > >> > > > > > > > > > 5%.
> >> > >> > >> > > > > > > > > > > > Can
> >> > >> > >> > > > > > > > > > > > > we take that to mean that UserA can
> use
> >> 5%
> >> > of
> >> > >> > the
> >> > >> > >> > time
> >> > >> > >> > > on
> >> > >> > >> > > > > > > network
> >> > >> > >> > > > > > > > > > > threads
> >> > >> > >> > > > > > > > > > > > > and 5% of the time on I/O threads? If
> >> > either
> >> > >> is
> >> > >> > >> > > exceeded,
> >> > >> > >> > > > > the
> >> > >> > >> > > > > > > > > > response
> >> > >> > >> > > > > > > > > > > is
> >> > >> > >> > > > > > > > > > > > > throttled - it would mean maintaining
> >> two
> >> > >> sets
> >> > >> > of
> >> > >> > >> > > metrics
> >> > >> > >> > > > > for
> >> > >> > >> > > > > > > the
> >> > >> > >> > > > > > > > > two
> >> > >> > >> > > > > > > > > > > > > durations, but would result in more
> >> > >> meaningful
> >> > >> > >> > ratios.
> >> > >> > >> > > We
> >> > >> > >> > > > > > could
> >> > >> > >> > > > > > > > > > define
> >> > >> > >> > > > > > > > > > > > two
> >> > >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request
> >> > threads
> >> > >> > and
> >> > >> > >> 10%
> >> > >> > >> > > of
> >> > >> > >> > > > > > > network
> >> > >> > >> > > > > > > > > > > > threads),
> >> > >> > >> > > > > > > > > > > > > but that seems unnecessary and harder
> to
> >> > >> explain
> >> > >> > >> to
> >> > >> > >> > > > users.
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > Back to why and how quotas are applied
> >> to
> >> > >> > network
> >> > >> > >> > > thread
> >> > >> > >> > > > > > > > > utilization:
> >> > >> > >> > > > > > > > > > > > > a) In the case of fetch,  the time
> >> spent in
> >> > >> the
> >> > >> > >> > network
> >> > >> > >> > > > > > thread
> >> > >> > >> > > > > > > > may
> >> > >> > >> > > > > > > > > be
> >> > >> > >> > > > > > > > > > > > > significant and I can see the need to
> >> > include
> >> > >> > >> this.
> >> > >> > >> > Are
> >> > >> > >> > > > > there
> >> > >> > >> > > > > > > > other
> >> > >> > >> > > > > > > > > > > > > requests where the network thread
> >> > >> utilization is
> >> > >> > >> > > > > significant?
> >> > >> > >> > > > > > > In
> >> > >> > >> > > > > > > > > the
> >> > >> > >> > > > > > > > > > > case
> >> > >> > >> > > > > > > > > > > > > of fetch, request handler thread
> >> > utilization
> >> > >> > would
> >> > >> > >> > > > throttle
> >> > >> > >> > > > > > > > clients
> >> > >> > >> > > > > > > > > > > with
> >> > >> > >> > > > > > > > > > > > > high request rate, low data volume and
> >> > fetch
> >> > >> > byte
> >> > >> > >> > rate
> >> > >> > >> > > > > quota
> >> > >> > >> > > > > > > will
> >> > >> > >> > > > > > > > > > > > throttle
> >> > >> > >> > > > > > > > > > > > > clients with high data volume. Network
> >> > thread
> >> > >> > >> > > utilization
> >> > >> > >> > > > > is
> >> > >> > >> > > > > > > > > perhaps
> >> > >> > >> > > > > > > > > > > > > proportional to the data volume. I am
> >> > >> wondering
> >> > >> > >> if we
> >> > >> > >> > > > even
> >> > >> > >> > > > > > need
> >> > >> > >> > > > > > > > to
> >> > >> > >> > > > > > > > > > > > throttle
> >> > >> > >> > > > > > > > > > > > > based on network thread utilization or
> >> > >> whether
> >> > >> > the
> >> > >> > >> > data
> >> > >> > >> > > > > > volume
> >> > >> > >> > > > > > > > > quota
> >> > >> > >> > > > > > > > > > > > covers
> >> > >> > >> > > > > > > > > > > > > this case.
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > b) At the moment, we record and check
> >> for
> >> > >> quota
> >> > >> > >> > > violation
> >> > >> > >> > > > > at
> >> > >> > >> > > > > > > the
> >> > >> > >> > > > > > > > > same
> >> > >> > >> > > > > > > > > > > > time.
> >> > >> > >> > > > > > > > > > > > > If a quota is violated, the response
> is
> >> > >> delayed.
> >> > >> > >> > Using
> >> > >> > >> > > > > Jay'e
> >> > >> > >> > > > > > > > > example
> >> > >> > >> > > > > > > > > > of
> >> > >> > >> > > > > > > > > > > > > disk reads for fetches happening in
> the
> >> > >> network
> >> > >> > >> > thread,
> >> > >> > >> > > > We
> >> > >> > >> > > > > > > can't
> >> > >> > >> > > > > > > > > > record
> >> > >> > >> > > > > > > > > > > > and
> >> > >> > >> > > > > > > > > > > > > delay a response after the disk reads.
> >> We
> >> > >> could
> >> > >> > >> > record
> >> > >> > >> > > > the
> >> > >> > >> > > > > > time
> >> > >> > >> > > > > > > > > spent
> >> > >> > >> > > > > > > > > > > on
> >> > >> > >> > > > > > > > > > > > > the network thread when the response
> is
> >> > >> complete
> >> > >> > >> and
> >> > >> > >> > > > > > introduce
> >> > >> > >> > > > > > > a
> >> > >> > >> > > > > > > > > > delay
> >> > >> > >> > > > > > > > > > > > for
> >> > >> > >> > > > > > > > > > > > > handling a subsequent request
> (separate
> >> out
> >> > >> > >> recording
> >> > >> > >> > > and
> >> > >> > >> > > > > > quota
> >> > >> > >> > > > > > > > > > > violation
> >> > >> > >> > > > > > > > > > > > > handling in the case of network thread
> >> > >> > overload).
> >> > >> > >> > Does
> >> > >> > >> > > > that
> >> > >> > >> > > > > > > make
> >> > >> > >> > > > > > > > > > sense?
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > Regards,
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > Rajini
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM,
> Becket
> >> > Qin <
> >> > >> > >> > > > > > > > becket.qin@gmail.com>
> >> > >> > >> > > > > > > > > > > > wrote:
> >> > >> > >> > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > Hey Jay,
> >> > >> > >> > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU
> >> time
> >> > >> is a
> >> > >> > >> > little
> >> > >> > >> > > > > > > tricky. I
> >> > >> > >> > > > > > > > > am
> >> > >> > >> > > > > > > > > > > > > thinking
> >> > >> > >> > > > > > > > > > > > > > that maybe we can use the existing
> >> > request
> >> > >> > >> > > statistics.
> >> > >> > >> > > > > They
> >> > >> > >> > > > > > > are
> >> > >> > >> > > > > > > > > > > already
> >> > >> > >> > > > > > > > > > > > > > very detailed so we can probably see
> >> the
> >> > >> > >> > approximate
> >> > >> > >> > > > CPU
> >> > >> > >> > > > > > time
> >> > >> > >> > > > > > > > > from
> >> > >> > >> > > > > > > > > > > it,
> >> > >> > >> > > > > > > > > > > > > e.g.
> >> > >> > >> > > > > > > > > > > > > > something like (total_time -
> >> > >> > >> > > > request/response_queue_time
> >> > >> > >> > > > > -
> >> > >> > >> > > > > > > > > > > > remote_time).
> >> > >> > >> > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > I agree with Guozhang that when a
> >> user is
> >> > >> > >> throttled
> >> > >> > >> > > it
> >> > >> > >> > > > is
> >> > >> > >> > > > > > > > likely
> >> > >> > >> > > > > > > > > > that
> >> > >> > >> > > > > > > > > > > > we
> >> > >> > >> > > > > > > > > > > > > > need to see if anything has went
> wrong
> >> > >> first,
> >> > >> > >> and
> >> > >> > >> > if
> >> > >> > >> > > > the
> >> > >> > >> > > > > > > users
> >> > >> > >> > > > > > > > > are
> >> > >> > >> > > > > > > > > > > well
> >> > >> > >> > > > > > > > > > > > > > behaving and just need more
> >> resources, we
> >> > >> will
> >> > >> > >> have
> >> > >> > >> > > to
> >> > >> > >> > > > > bump
> >> > >> > >> > > > > > > up
> >> > >> > >> > > > > > > > > the
> >> > >> > >> > > > > > > > > > > > quota
> >> > >> > >> > > > > > > > > > > > > > for them. It is true that
> >> pre-allocating
> >> > >> CPU
> >> > >> > >> time
> >> > >> > >> > > quota
> >> > >> > >> > > > > > > > precisely
> >> > >> > >> > > > > > > > > > for
> >> > >> > >> > > > > > > > > > > > the
> >> > >> > >> > > > > > > > > > > > > > users is difficult. So in practice
> it
> >> > would
> >> > >> > >> > probably
> >> > >> > >> > > be
> >> > >> > >> > > > > > more
> >> > >> > >> > > > > > > > like
> >> > >> > >> > > > > > > > > > > first
> >> > >> > >> > > > > > > > > > > > > set
> >> > >> > >> > > > > > > > > > > > > > a relative high protective CPU time
> >> quota
> >> > >> for
> >> > >> > >> > > everyone
> >> > >> > >> > > > > and
> >> > >> > >> > > > > > > > > increase
> >> > >> > >> > > > > > > > > > > > that
> >> > >> > >> > > > > > > > > > > > > > for some individual clients on
> demand.
> >> > >> > >> > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > Thanks,
> >> > >> > >> > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> >> > >> > >> > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM,
> >> Guozhang
> >> > >> > Wang <
> >> > >> > >> > > > > > > > > wangguoz@gmail.com
> >> > >> > >> > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > wrote:
> >> > >> > >> > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > > This is a great proposal, glad to
> >> see
> >> > it
> >> > >> > >> > happening.
> >> > >> > >> > > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > > I am inclined to the CPU
> >> throttling, or
> >> > >> more
> >> > >> > >> > > > > specifically
> >> > >> > >> > > > > > > > > > > processing
> >> > >> > >> > > > > > > > > > > > > time
> >> > >> > >> > > > > > > > > > > > > > > ratio instead of the request rate
> >> > >> throttling
> >> > >> > >> as
> >> > >> > >> > > well.
> >> > >> > >> > > > > > > Becket
> >> > >> > >> > > > > > > > > has
> >> > >> > >> > > > > > > > > > > very
> >> > >> > >> > > > > > > > > > > > > > well
> >> > >> > >> > > > > > > > > > > > > > > summed my rationales above, and
> one
> >> > >> thing to
> >> > >> > >> add
> >> > >> > >> > > here
> >> > >> > >> > > > > is
> >> > >> > >> > > > > > > that
> >> > >> > >> > > > > > > > > the
> >> > >> > >> > > > > > > > > > > > > former
> >> > >> > >> > > > > > > > > > > > > > > has a good support for both
> >> "protecting
> >> > >> > >> against
> >> > >> > >> > > rogue
> >> > >> > >> > > > > > > > clients"
> >> > >> > >> > > > > > > > > as
> >> > >> > >> > > > > > > > > > > > well
> >> > >> > >> > > > > > > > > > > > > as
> >> > >> > >> > > > > > > > > > > > > > > "utilizing a cluster for
> >> multi-tenancy
> >> > >> > usage":
> >> > >> > >> > when
> >> > >> > >> > > > > > > thinking
> >> > >> > >> > > > > > > > > > about
> >> > >> > >> > > > > > > > > > > > how
> >> > >> > >> > > > > > > > > > > > > to
> >> > >> > >> > > > > > > > > > > > > > > explain this to the end users, I
> >> find
> >> > it
> >> > >> > >> actually
> >> > >> > >> > > > more
> >> > >> > >> > > > > > > > natural
> >> > >> > >> > > > > > > > > > than
> >> > >> > >> > > > > > > > > > > > the
> >> > >> > >> > > > > > > > > > > > > > > request rate since as mentioned
> >> above,
> >> > >> > >> different
> >> > >> > >> > > > > requests
> >> > >> > >> > > > > > > > will
> >> > >> > >> > > > > > > > > > have
> >> > >> > >> > > > > > > > > > > > > quite
> >> > >> > >> > > > > > > > > > > > > > > different "cost", and Kafka today
> >> > already
> >> > >> > have
> >> > >> > >> > > > various
> >> > >> > >> > > > > > > > request
> >> > >> > >> > > > > > > > > > > types
> >> > >> > >> > > > > > > > > > > > > > > (produce, fetch, admin, metadata,
> >> etc),
> >> > >> > >> because
> >> > >> > >> > of
> >> > >> > >> > > > that
> >> > >> > >> > > > > > the
> >> > >> > >> > > > > > > > > > request
> >> > >> > >> > > > > > > > > > > > > rate
> >> > >> > >> > > > > > > > > > > > > > > throttling may not be as effective
> >> > >> unless it
> >> > >> > >> is
> >> > >> > >> > set
> >> > >> > >> > > > > very
> >> > >> > >> > > > > > > > > > > > > conservatively.
> >> > >> > >> > > > > > > > > > > > > > >
> >> > >> > >> > > > > > > > > > > > > > > Regarding to user reactions when
> >> they
> >> > are
> >> > >> > >> > > throttled,
> >> > >> > >> > > > I
> >> > >> > >> > > > > > > think
> >> > >> > >> > > > > > > > it
> >> > >> > >> > > > > > > > > > may
> >> > >> > >> > > > > > > > > > > > > > differ
> >> > >> > >> > > > > > > > > > > > > > > case-by-case, and need to be
> >> > discovered /
> >> > >> > >> guided
> >> > >> > >> > by
> >> > >> > >> > > > > > looking
> >> > >> > >> > > > > > > > at
> >> > >> > >> > > > > > > > > > > > relative
> >> > >> > >> > > > > > > > > > > > > > > metrics. So in other words users
> >> would
> >> > >> not
> >> > >> > >> expect
> >> > >> > >> > > to
> >> > >> > >> > > > > get
> >> > >> > >> > > > > > > > > > additional
> >> > >> > >> > > > > > > > > > > > > > > information by simply being told
> >> "hey,
> >> > >> you
> >> > >> > are
> >> > >> > >> > > > > > throttled",
> >> > >> > >> > > > > > > > > which
> >> > >> > >> > > > > > > > > > is
> >> > >> > >> > > > > > > > > > > > all
> >> > >> > >> > > > > > > > > > > > > > > what throttling does; they need to
> >> > take a
> >> > >> > >> > follow-up
> >> > >> > >> > > > > step
> >> > >> > >> > > > > > > and
> >> > >> > >> > > > > > > > > see
> >> > >> > >> > > > > > > > > > > > "hmm,
> >> > >> > >> > > > > > > > > > > > > > I'm
> >> > >> > >> > > > > > > > > > > > > > > throttled probably because of ..",
> >> > which
> >> > >> is
> >> > >> > by
> >> > >> > >> > > > looking
> >> > >> > >> > > > > at
> >> > >> > >> > > > > > > > other
> >> > >> > >> > > > > > > > > > > > metric
> >> > >> > >> > > > > > > > > > > > > > > values: e.g. whether I'm
> bombarding
> >> the
> >> > >> > >> brokers
> >> > >> > >> > > with
> >> > >> > >> > > > >
> >>
> > ...
> >
> > [Message clipped]
>
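To make Becket's suggestion above concrete, here is a rough sketch (mine; the
field names are illustrative stand-ins, loosely modelled on the per-request
time metrics the broker already records, not the real RequestMetrics classes):

    final class RequestTimes {
        final long totalTimeMs;          // end-to-end time for the request
        final long requestQueueTimeMs;   // waiting for a request handler thread
        final long responseQueueTimeMs;  // waiting for a network thread
        final long remoteTimeMs;         // waiting in purgatory, e.g. on replication

        RequestTimes(long totalTimeMs, long requestQueueTimeMs,
                     long responseQueueTimeMs, long remoteTimeMs) {
            this.totalTimeMs = totalTimeMs;
            this.requestQueueTimeMs = requestQueueTimeMs;
            this.responseQueueTimeMs = responseQueueTimeMs;
            this.remoteTimeMs = remoteTimeMs;
        }

        // Approximate CPU time, as Becket suggests: total time minus the
        // time spent waiting in queues or on remote operations.
        long approximateCpuTimeMs() {
            return totalTimeMs - requestQueueTimeMs
                    - responseQueueTimeMs - remoteTimeMs;
        }
    }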

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Another way to express an absolute limit is to use request.percentage, but
treat it as the percentage used in a single request handling thread. For
now, the request handling threads can be just the io threads. In the
future, they can cover the network threads as well. This is similar to how
top reports CPU usage and may be a bit easier for people to understand.
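As a worked example of this reading (the numbers, the window length and the
throttle formula below are mine, not from the KIP): a quota of 50 means a user
may consume half of one request handler thread, however many threads the
broker runs, so adding io threads adds capacity without diluting anyone's
existing allocation.

    // Sketch of the "percentage of one request handler thread" arithmetic.
    public class PerThreadPercentageExample {
        public static void main(String[] args) {
            double quotaPercentOfOneThread = 50.0; // may use half a thread
            long windowMs = 1000;                  // assumed measurement window

            // Handler thread time consumed by this user's requests in the
            // window, summed across all request handler threads.
            long usedThreadTimeMs = 700;
            double usedPercent = 100.0 * usedThreadTimeMs / windowMs; // 70%

            if (usedPercent > quotaPercentOfOneThread) {
                // Delay long enough that usage amortizes back to the quota:
                // 1000 * (70 - 50) / 50 = 400 ms.
                long delayMs = (long) (windowMs
                        * (usedPercent - quotaPercentOfOneThread)
                        / quotaPercentOfOneThread);
                System.out.println("throttle for " + delayMs + " ms");
            }
        }
    }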

Thanks,

Jun

On Fri, Feb 24, 2017 at 10:31 AM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Jay,
>
> 2. Regarding request.unit vs request.percentage. I started with
> request.percentage too. The reasoning for request.unit is the following.
> Suppose that the capacity has been reached on a broker and the admin needs
> to add a new user. A simple way to increase the capacity is to increase the
> number of io threads, assuming there are still enough cores. If the limit
> is based on percentage, the additional capacity automatically gets
> distributed to existing users and we haven't really carved out any
> additional resource for the new user. Now, is it easy for a user to reason
> about 0.1 unit vs 10%? My feeling is that both are hard and have to be
> configured empirically. Not sure if percentage is obviously easier to
> reason about.
>
> Thanks,
>
> Jun
>
> On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:
>
>> A couple of quick points:
>>
>> 1. Even though the implementation of this quota is only using io thread
>> time, i think we should call it something like "request-time". This will
>> give us flexibility to improve the implementation to cover network threads
>> in the future and will avoid exposing internal details like our thread
>> pools on the server.
>>
>> 2. Jun/Roger, I get what you are trying to fix but the idea of
>> thread/units
>> is super unintuitive as a user-facing knob. I had to read the KIP like
>> eight times to understand this. I'm not sure about your point that
>> increasing the number of threads is a problem with a percentage-based
>> value; it really depends on whether the user thinks about the "percentage
>> of request processing time" or "thread units". If they think "I have
>> allocated 10% of my request processing time to user x" then it is a bug
>> that increasing the thread count decreases that percent as it does in the
>> current proposal. As a practical matter I think the only way to actually
>> reason about this is as a percent---I just don't believe people are going
>> to think, "ah, 4.3 thread units, that is the right amount!". Instead I
>> think they have to understand this thread unit concept, figure out what
>> they have set in number of threads, compute a percent and then come up
>> with
>> the number of thread units, and these will all be wrong if that thread
>> count changes. I also think this ties us to throttling the I/O thread
>> pool,
>> which may not be where we want to end up.
>>
>> 3. For what it's worth I do think having a single throttle_ms field in all
>> the responses that combines all throttling from all quotas is probably the
>> simplest. There could be a use case for having separate fields for each,
>> but I think that is actually harder to use/monitor in the common case so
>> unless someone has a use case I think just one should be fine.
>>
>> -Jay
>>
>> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <ra...@gmail.com>
>> wrote:
>>
>> > I have updated the KIP based on the discussions so far.
>> >
>> >
>> > Regards,
>> >
>> > Rajini
>> >
>> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <
>> rajinisivaram@gmail.com>
>> > wrote:
>> >
>> > > Thank you all for the feedback.
>> > >
>> > > Ismael #1. It makes sense not to throttle inter-broker requests like
>> > > LeaderAndIsr etc. The simplest way to ensure that clients cannot use
>> > these
>> > > requests to bypass quotas for DoS attacks is to ensure that ACLs
>> prevent
>> > > clients from using these requests and unauthorized requests are
>> included
>> > > towards quotas.
>> > >
>> > > Ismael #2, Jay #1 : I was thinking that these quotas can return a
>> > separate
>> > > throttle time, and all utilization based quotas could use the same
>> field
>> > > (we won't add another one for network thread utilization for
>> instance).
>> > But
>> > > perhaps it makes sense to keep byte rate quotas separate in
>> produce/fetch
>> > > responses to provide separate metrics? Agree with Ismael that the
>> name of
>> > > the existing field should be changed if we have two. Happy to switch
>> to a
>> > > single combined throttle time if that is sufficient.
>> > >
>> > > Ismael #4, #5, #6: Will update KIP. Will use dot separated name for
>> new
>> > > property. Replication quotas use dot separated, so it will be
>> consistent
>> > > with all properties except byte rate quotas.
>> > >
>> > > Radai: #1 Request processing time rather than request rate was chosen
>> > > because the time per request can vary significantly between requests
>> as
>> > > mentioned in the discussion and KIP.
>> > > #2 Two separate quotas for heartbeats/regular requests feel like more
>> > > configuration and more metrics. Since most users would set quotas
>> higher
>> > > than the expected usage and quotas are more of a safety net, a single
>> > quota
>> > > should work in most cases.
>> > >  #3 The number of requests in purgatory is limited by the number of
>> > active
>> > > connections since only one request per connection will be throttled
>> at a
>> > > time.
>> > > #4 As with byte rate quotas, to use the full allocated quotas,
>> > > clients/users would need to use partitions that are distributed across
>> > the
>> > > cluster. The alternative of using cluster-wide quotas instead of
>> > per-broker
>> > > quotas would be far too complex to implement.
>> > >
>> > > Dong : We currently have two ClientQuotaManagers for quota types Fetch
>> > and
>> > > Produce. A new one will be added for IOThread, which manages quotas
>> for
>> > I/O
>> > > thread utilization. This will not update the Fetch or Produce
>> queue-size,
>> > > but will have a separate metric for the queue-size.  I wasn't
>> planning to
>> > > add any additional metrics apart from the equivalent ones for existing
>> > > quotas as part of this KIP. Ratio of byte-rate to I/O thread
>> utilization
>> > > could be slightly misleading since it depends on the sequence of
>> > requests.
>> > > But we can look into more metrics after the KIP is implemented if
>> > required.
>> > >
>> > > I think we need to limit the maximum delay since all requests are
>> > > throttled. If a client has a quota of 0.001 units and a single request
>> > used
>> > > 50ms, we don't want to delay all requests from the client by 50
>> seconds,
>> > > throwing the client out of all its consumer groups. The issue is only
>> if
>> > a
>> > > user is allocated a quota that is insufficient to process one large
>> > > request. The expectation is that the units allocated per user will be
>> > much
>> > > higher than the time taken to process one request and the limit should
>> > > seldom be applied. Agree this needs proper documentation.
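Plugging Rajini's numbers in (a sketch of mine; the one-second window is an
assumed placeholder, not a stated default):

    // A 0.001-unit quota with a single 50 ms request would, uncapped, imply
    // roughly a 50-second delay; the cap limits it to one quota window.
    public class MaxDelayExample {
        public static void main(String[] args) {
            double quotaUnits = 0.001; // fraction of one io thread
            long requestTimeMs = 50;   // thread time used by one large request
            long quotaWindowMs = 1000; // assumed quota sample window

            long uncappedDelayMs =
                    (long) (requestTimeMs / quotaUnits) - requestTimeMs; // 49950
            long cappedDelayMs = Math.min(uncappedDelayMs, quotaWindowMs); // 1000

            System.out.println("uncapped=" + uncappedDelayMs
                    + " ms, capped=" + cappedDelayMs + " ms");
        }
    }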
>> > >
>> > > Regards,
>> > >
>> > > Rajini
>> > >
>> > >
>> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com>
>> > wrote:
>> > >
>> > >> @jun: i wasnt concerned about tying up a request processing thread,
>> but
>> > >> IIUC the code does still read the entire request out, which might
>> add-up
>> > >> to
>> > >> a non-negligible amount of memory.
>> > >>
>> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com>
>> wrote:
>> > >>
>> > >> > Hey Rajini,
>> > >> >
>> > >> > The current KIP says that the maximum delay will be reduced to
>> window
>> > >> size
>> > >> > if it is larger than the window size. I have a concern with this:
>> > >> >
>> > >> > 1) This essentially means that the user is allowed to exceed their
>> > quota
>> > >> > over a long period of time. Can you provide an upper bound on this
>> > >> > deviation?
>> > >> >
>> > >> > 2) What is the motivation for capping the maximum delay at the
>> > >> > window size? I am wondering if there is a better alternative to
>> > >> > address the problem.
>> > >> >
>> > >> > 3) It means that the existing metric-related config will have a
>> > >> > more direct impact on the mechanism of this io-thread-unit-based
>> > >> > quota. That may be an important change depending on the answer to
>> > >> > 1) above. We probably need to document this more explicitly.
>> > >> >
>> > >> > Dong
>> > >> >
>> > >> >
>> > >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com>
>> > wrote:
>> > >> >
>> > >> > > Hey Jun,
>> > >> > >
>> > >> > > Yeah you are right. I thought it wasn't because at LinkedIn it
>> will
>> > be
>> > >> > too
>> > >> > > much pressure on inGraph to expose those per-clientId metrics so
>> we
>> > >> ended
>> > >> > > up printing them periodically to local log. Never mind if it is
>> not
>> > a
>> > >> > > general problem.
>> > >> > >
>> > >> > > Hey Rajini,
>> > >> > >
>> > >> > > - I agree with Jay that we probably don't want to add a new field
>> > for
>> > >> > > every quota ProduceResponse or FetchResponse. Is there any
>> use-case
>> > >> for
>> > >> > > having separate throttle-time fields for byte-rate-quota and
>> > >> > > io-thread-unit-quota? You probably need to document this as
>> > interface
>> > >> > > change if you plan to add new field in any request.
>> > >> > >
>> > >> > > - I don't think IOThread belongs to quotaType. The existing quota
>> > >> types
>> > >> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
>> identify
>> > >> the
>> > >> > > type of request that are throttled, not the quota mechanism that
>> is
>> > >> > applied.
>> > >> > >
>> > >> > > - If a request is throttled due to this io-thread-unit-based
>> quota,
>> > is
>> > >> > the
>> > >> > > existing queue-size metric in ClientQuotaManager incremented?
>> > >> > >
>> > >> > > - In the interest of providing guide line for admin to decide
>> > >> > > io-thread-unit-based quota and for user to understand its impact
>> on
>> > >> their
>> > >> > > traffic, would it be useful to have a metric that shows the
>> overall
>> > >> > > byte-rate per io-thread-unit? Can we also show this a
>> per-clientId
>> > >> > metric?
>> > >> > >
>> > >> > > Thanks,
>> > >> > > Dong
>> > >> > >
>> > >> > >
>> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io>
>> wrote:
>> > >> > >
>> > >> > >> Hi, Ismael,
>> > >> > >>
>> > >> > >> For #3, typically, an admin won't configure more io threads than
>> > CPU
>> > >> > >> cores,
>> > >> > >> but it's possible for an admin to start with fewer io threads
>> than
>> > >> cores
>> > >> > >> and grow that later on.
>> > >> > >>
>> > >> > >> Hi, Dong,
>> > >> > >>
>> > >> > >> I think the throttleTime sensor on the broker tells the admin
>> > >> whether a
>> > >> > >> user/clentId is throttled or not.
>> > >> > >>
>> > >> > >> Hi, Radi,
>> > >> > >>
>> > >> > >> The reasoning for delaying the throttled requests on the broker
>> > >> instead
>> > >> > of
>> > >> > >> returning an error immediately is that the latter has no way to
>> > >> prevent
>> > >> > >> the
>> > >> > >> client from retrying immediately, which will make things worse.
>> The
>> > >> > >> delaying logic is based off a delay queue. A separate expiration
>> > >> thread
>> > >> > >> just waits on the next to be expired request. So, it doesn't tie
>> > up a
>> > >> > >> request handler thread.
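A minimal sketch of that pattern, using java.util.concurrent.DelayQueue (an
illustration of the mechanism Jun describes, not Kafka's actual throttling
code):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    public class ThrottledResponseQueue {

        // One parked response; becomes available once its delay expires.
        static final class ThrottledResponse implements Delayed {
            final long sendAtNanos;
            final Runnable sendResponse;

            ThrottledResponse(long delayMs, Runnable sendResponse) {
                this.sendAtNanos =
                        System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
                this.sendResponse = sendResponse;
            }

            @Override
            public long getDelay(TimeUnit unit) {
                return unit.convert(sendAtNanos - System.nanoTime(),
                        TimeUnit.NANOSECONDS);
            }

            @Override
            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                        other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void throttle(long delayMs, Runnable sendResponse) {
            queue.put(new ThrottledResponse(delayMs, sendResponse));
        }

        // Runs on a single dedicated expiration thread; take() blocks until
        // the next response is due, so no request handler thread is tied up.
        void expirationLoop() throws InterruptedException {
            while (true) {
                queue.take().sendResponse.run();
            }
        }
    }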
>> > >> > >>
>> > >> > >> Thanks,
>> > >> > >>
>> > >> > >> Jun
>> > >> > >>
>> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <ismael@juma.me.uk
>> >
>> > >> wrote:
>> > >> > >>
>> > >> > >> > Hi Jay,
>> > >> > >> >
>> > >> > >> > Regarding 1, I definitely like the simplicity of keeping a
>> single
>> > >> > >> throttle
>> > >> > >> > time field in the response. The downside is that the client
>> > metrics
>> > >> > >> will be
>> > >> > >> > more coarse grained.
>> > >> > >> >
>> > >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage`
>> > and
>> > >> > >> > `log.cleaner.min.cleanable.ratio`.
>> > >> > >> >
>> > >> > >> > Ismael
>> > >> > >> >
>> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io>
>> > >> wrote:
>> > >> > >> >
>> > >> > >> > > A few minor comments:
>> > >> > >> > >
>> > >> > >> > >    1. Isn't it the case that the throttling time response
>> field
>> > >> > should
>> > >> > >> > have
>> > >> > >> > >    the total time your request was throttled irrespective of
>> > the
>> > >> > >> quotas
>> > >> > >> > > that
>> > >> > >> > >    caused that. Limiting it to byte rate quota doesn't make
>> > >> sense,
>> > >> > >> but I
>> > >> > >> > > also
>> > >> > >> > >    don't think we want to end up adding new fields in the
>> > >> response
>> > >> > >> for
>> > >> > >> > > every
>> > >> > >> > >    single thing we quota, right?
>> > >> > >> > >    2. I don't think we should make this quota specifically
>> > about
>> > >> io
>> > >> > >> > >    threads. Once we introduce these quotas people set them
>> and
>> > >> > expect
>> > >> > >> > them
>> > >> > >> > > to
>> > >> > >> > >    be enforced (and if they aren't it may cause an outage).
>> As
>> > a
>> > >> > >> result
>> > >> > >> > > they
>> > >> > >> > >    are a bit more sensitive than normal configs, I think.
>> The
>> > >> > current
>> > >> > >> > > thread
>> > >> > >> > >    pools seem like something of an implementation detail and
>> > not
>> > >> the
>> > >> > >> > level
>> > >> > >> > > the
>> > >> > >> > >    user-facing quotas should be involved with. I think it
>> might
>> > >> be
>> > >> > >> better
>> > >> > >> > > to
>> > >> > >> > >    make this a general request-time throttle with no
>> mention in
>> > >> the
>> > >> > >> > naming
>> > >> > >> > >    about I/O threads and simply acknowledge the current
>> > >> limitation
>> > >> > >> (which
>> > >> > >> > > we
>> > >> > >> > >    may someday fix) in the docs that this covers only the
>> time
>> > >> after
>> > >> > >> the
>> > >> > >> > >    thread is read off the network.
>> > >> > >> > >    3. As such I think the right interface to the user would
>> be
>> > >> > >> something
>> > >> > >> > >    like percent_request_time and be in {0,...100} or
>> > >> > >> request_time_ratio
>> > >> > >> > > and be
>> > >> > >> > >    in {0.0,...,1.0} (I think "ratio" is the terminology we
>> used
>> > >> if
>> > >> > the
>> > >> > >> > > scale
>> > >> > >> > >    is between 0 and 1 in the other metrics, right?)
>> > >> > >> > >
>> > >> > >> > > -Jay
>> > >> > >> > >
>> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
>> > >> > >> rajinisivaram@gmail.com
>> > >> > >> > >
>> > >> > >> > > wrote:
>> > >> > >> > >
>> > >> > >> > > > Guozhang/Dong,
>> > >> > >> > > >
>> > >> > >> > > > Thank you for the feedback.
>> > >> > >> > > >
>> > >> > >> > > > Guozhang : I have updated the section on co-existence of
>> byte
>> > >> rate
>> > >> > >> and
>> > >> > >> > > > request time quotas.
>> > >> > >> > > >
>> > >> > >> > > > Dong: I hadn't added much detail to the metrics and
>> sensors
>> > >> since
>> > >> > >> they
>> > >> > >> > > are
>> > >> > >> > > > going to be very similar to the existing metrics and
>> sensors.
>> > >> To
>> > >> > >> avoid
>> > >> > >> > > > confusion, I have now added more detail. All metrics are
>> in
>> > the
>> > >> > >> group
>> > >> > >> > > > "quotaType" and all sensors have names starting with
>> > >> "quotaType"
>> > >> > >> (where
>> > >> > >> > > > quotaType is Produce/Fetch/LeaderReplication/
>> > >> > >> > > > FollowerReplication/*IOThread*).
>> > >> > >> > > > So there will be no reuse of existing metrics/sensors. The
>> > new
>> > >> > ones
>> > >> > >> for
>> > >> > >> > > > request processing time based throttling will be
>> completely
>> > >> > >> independent
>> > >> > >> > > of
>> > >> > >> > > > existing metrics/sensors, but will be consistent in
>> format.
>> > >> > >> > > >
>> > >> > >> > > > The existing throttle_time_ms field in produce/fetch
>> > responses
>> > >> > will
>> > >> > >> not
>> > >> > >> > > be
>> > >> > >> > > > impacted by this KIP. That will continue to return
>> byte-rate
>> > >> based
>> > >> > >> > > > throttling times. In addition, a new field
>> > >> > request_throttle_time_ms
>> > >> > >> > will
>> > >> > >> > > be
>> > >> > >> > > > added to return request quota based throttling times.
>> These
>> > >> will
>> > >> > be
>> > >> > >> > > exposed
>> > >> > >> > > > as new metrics on the client-side.
>> > >> > >> > > >
>> > >> > >> > > > Since all metrics and sensors are different for each type
>> of
>> > >> > quota,
>> > >> > >> I
>> > >> > >> > > > believe there are already sufficient metrics to monitor
>> > >> throttling
>> > >> > on
>> > >> > >> > both
>> > >> > >> > > > client and broker side for each type of throttling.
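For illustration, a sensor following that convention might be registered like
the sketch below, using the common Metrics API; the exact sensor and metric
names here are my guesses, not the KIP's final choices:

    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Rate;

    public class IoThreadQuotaSensorExample {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();

            // Sensor name prefixed by the quota type, per the convention above.
            Sensor sensor = metrics.sensor("IOThread-user1");
            sensor.add(metrics.metricName("io-thread-time-rate", "IOThread",
                    "Request handler thread time used by user1, per second"),
                    new Rate());

            // Recorded when a request completes: thread time (ms) it consumed.
            sensor.record(3.5);
            metrics.close();
        }
    }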
>> > >> > >> > > >
>> > >> > >> > > > Regards,
>> > >> > >> > > >
>> > >> > >> > > > Rajini
>> > >> > >> > > >
>> > >> > >> > > >
>> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <
>> > lindong28@gmail.com
>> > >> >
>> > >> > >> wrote:
>> > >> > >> > > >
>> > >> > >> > > > > Hey Rajini,
>> > >> > >> > > > >
>> > >> > >> > > > > I think it makes a lot of sense to use io_thread_units
>> as
>> > >> metric
>> > >> > >> to
>> > >> > >> > > quota
>> > >> > >> > > > > user's traffic here. LGTM overall. I have some questions
>> > >> > regarding
>> > >> > >> > > > sensors.
>> > >> > >> > > > >
>> > >> > >> > > > > - Can you be more specific in the KIP what sensors will
>> be
>> > >> > added?
>> > >> > >> For
>> > >> > >> > > > > example, it will be useful to specify the name and
>> > >> attributes of
>> > >> > >> > these
>> > >> > >> > > > new
>> > >> > >> > > > > sensors.
>> > >> > >> > > > >
>> > >> > >> > > > > - We currently have throttle-time and queue-size for
>> > >> byte-rate
>> > >> > >> based
>> > >> > >> > > > quota.
>> > >> > >> > > > > Are you going to have separate throttle-time and
>> queue-size
>> > >> for
>> > >> > >> > > requests
>> > >> > >> > > > > throttled by io_thread_unit-based quota, or will they
>> share
>> > >> the
>> > >> > >> same
>> > >> > >> > > > > sensor?
>> > >> > >> > > > >
>> > >> > >> > > > > - Does the throttle-time in the ProduceResponse and
>> > >> > FetchResponse
>> > >> > >> > > > contains
>> > >> > >> > > > > time due to io_thread_unit-based quota?
>> > >> > >> > > > >
>> > >> > >> > > > > - Currently kafka server does not provide any log or
>> > >> metrics
>> > >> > >> that
>> > >> > >> > > > tells
>> > >> > >> > > > > whether any given clientId (or user) is throttled. This
>> is
>> > >> not
>> > >> > too
>> > >> > >> > bad
>> > >> > >> > > > > because we can still check the client-side byte-rate
>> metric
>> > >> to
>> > >> > >> > validate
>> > >> > >> > > > > whether a given client is throttled. But with this
>> > >> > io_thread_unit,
>> > >> > >> > > there
>> > >> > >> > > > > will be no way to validate whether a given client is
>> slow
>> > >> > because
>> > >> > >> it
>> > >> > >> > > has
>> > >> > >> > > > > exceeded its io_thread_unit limit. It is necessary for
>> user
>> > >> to
>> > >> > be
>> > >> > >> > able
>> > >> > >> > > to
>> > >> > >> > > > > know this information to figure how whether they have
>> > reached
>> > >> > >> there
>> > >> > >> > > quota
>> > >> > >> > > > > limit. How about we add log4j log on the server side to
>> > >> > >> periodically
>> > >> > >> > > > print
>> > >> > >> > > > > the (client_id, byte-rate-throttle-time,
>> > >> > >> > io-thread-unit-throttle-time)
>> > >> > >> > > so
>> > >> > >> > > > > that the kafka administrator can figure out those users that
>> have
>> > >> > reached
>> > >> > >> > their
>> > >> > >> > > > > limit and act accordingly?
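A sketch of what such a periodic log could look like (the names and the
60-second interval are illustrative, not a proposed implementation; the map
would be fed by the quota managers):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class ThrottleTimeLogger {
        private static final Logger log =
                LoggerFactory.getLogger(ThrottleTimeLogger.class);

        // clientId -> {byteRateThrottleMs, ioThreadUnitThrottleMs} in the
        // last interval, updated by the quota managers.
        private final Map<String, long[]> throttleMs = new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        void start() {
            scheduler.scheduleAtFixedRate(() ->
                    throttleMs.forEach((clientId, times) ->
                            log.info("clientId={} byteRateThrottleMs={} "
                                    + "ioThreadUnitThrottleMs={}",
                                    clientId, times[0], times[1])),
                    60, 60, TimeUnit.SECONDS);
        }
    }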
>> > >> > >> > > > >
>> > >> > >> > > > > Thanks,
>> > >> > >> > > > > Dong
>> > >> > >> > > > >
>> > >> > >> > > > >
>> > >> > >> > > > >
>> > >> > >> > > > >
>> > >> > >> > > > >
>> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
>> > >> > >> wangguoz@gmail.com>
>> > >> > >> > > > wrote:
>> > >> > >> > > > >
>> > >> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
>> > >> comment
>> > >> > on
>> > >> > >> > the
>> > >> > >> > > > > > throttling implementation:
>> > >> > >> > > > > >
>> > >> > >> > > > > > Stated as "Request processing time throttling will be
>> > >> applied
>> > >> > on
>> > >> > >> > top
>> > >> > >> > > if
>> > >> > >> > > > > > necessary." I thought that it meant the request
>> > processing
>> > >> > time
>> > >> > >> > > > > throttling
>> > >> > >> > > > > > is applied first, but reading further I found it
>> > actually
>> > >> > >> meant to
>> > >> > >> > > > apply
>> > >> > >> > > > > > produce / fetch byte rate throttling first.
>> > >> > >> > > > > >
>> > >> > >> > > > > > Also the last sentence "The remaining delay if any is
>> > >> applied
>> > >> > to
>> > >> > >> > the
>> > >> > >> > > > > > response." is a bit confusing to me. Maybe rewording
>> it a
>> > >> bit?
>> > >> > >> > > > > >
>> > >> > >> > > > > >
>> > >> > >> > > > > > Guozhang
>> > >> > >> > > > > >
>> > >> > >> > > > > >
>> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <
>> > jun@confluent.io
>> > >> >
>> > >> > >> wrote:
>> > >> > >> > > > > >
>> > >> > >> > > > > > > Hi, Rajini,
>> > >> > >> > > > > > >
>> > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal
>> looks
>> > >> good
>> > >> > to
>> > >> > >> me.
>> > >> > >> > > > > > >
>> > >> > >> > > > > > > Jun
>> > >> > >> > > > > > >
>> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
>> > >> > >> > > > > rajinisivaram@gmail.com
>> > >> > >> > > > > > >
>> > >> > >> > > > > > > wrote:
>> > >> > >> > > > > > >
>> > >> > >> > > > > > > > Jun/Roger,
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > Thank you for the feedback.
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > 1. I have updated the KIP to use absolute units
>> > >> instead of
>> > >> > >> > > > > percentage.
>> > >> > >> > > > > > > The
>> > >> > >> > > > > > > > property is called* io_thread_units* to align with
>> > the
>> > >> > >> thread
>> > >> > >> > > count
>> > >> > >> > > > > > > > property *num.io.threads*. When we implement
>> network
>> > >> > thread
>> > >> > >> > > > > utilization
>> > >> > >> > > > > > > > quotas, we can add another property
>> > >> > *network_thread_units.*
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > 2. ControlledShutdown is already listed under the
>> > >> exempt
>> > >> > >> > > requests.
>> > >> > >> > > > > Jun,
>> > >> > >> > > > > > > did
>> > >> > >> > > > > > > > you mean a different request that needs to be
>> added?
>> > >> The
>> > >> > >> four
>> > >> > >> > > > > requests
>> > >> > >> > > > > > > > currently exempt in the KIP are StopReplica,
>> > >> > >> > ControlledShutdown,
>> > >> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These are
>> controlled
>> > >> > using
>> > >> > >> > > > > > ClusterAction
>> > >> > >> > > > > > > > ACL, so it is easy to exclude and only throttle if
>> > >> > >> > unauthorized.
>> > >> > >> > > I
>> > >> > >> > > > > > wasn't
>> > >> > >> > > > > > > > sure if there are other requests used only for
>> > >> > inter-broker
>> > >> > >> > that
>> > >> > >> > > > > needed
>> > >> > >> > > > > > > to
>> > >> > >> > > > > > > > be excluded.
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > 3. I was thinking the smallest change would be to
>> > >> replace
>> > >> > >> all
>> > >> > >> > > > > > references
>> > >> > >> > > > > > > to
>> > >> > >> > > > > > > > *requestChannel.sendResponse()* with a local
>> method
>> > >> > >> > > > > > > > *sendResponseMaybeThrottle()* that does the
>> > throttling
>> > >> if
>> > >> > >> any
>> > >> > >> > > plus
>> > >> > >> > > > > send
>> > >> > >> > > > > > > > response. If we throttle first in
>> > *KafkaApis.handle()*,
>> > >> > the
>> > >> > >> > time
>> > >> > >> > > > > spent
>> > >> > >> > > > > > > > within the method handling the request will not be
>> > >> > recorded
>> > >> > >> or
>> > >> > >> > > used
>> > >> > >> > > > > in
>> > >> > >> > > > > > > > throttling. We can look into this again when the
>> PR
>> > is
>> > >> > ready
>> > >> > >> > for
>> > >> > >> > > > > > review.
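For illustration, the shape of that change might look like the following
sketch (written in Java although the broker code is Scala; all types are
stand-ins for the real broker classes, not the actual KafkaApis API):

    interface QuotaManager {
        // Records the observed thread time; returns the delay to apply, 0 if none.
        long recordAndMaybeThrottle(String user, String clientId, long threadTimeMs);
    }

    interface ResponseSender {
        void send(Object response);                       // immediate send
        void sendDelayed(Object response, long delayMs);  // send via delay queue
    }

    class ThrottlingResponder {
        private final QuotaManager quotas;
        private final ResponseSender sender;

        ThrottlingResponder(QuotaManager quotas, ResponseSender sender) {
            this.quotas = quotas;
            this.sender = sender;
        }

        // Replacement for direct requestChannel.sendResponse() calls:
        // throttling happens at response time, so the handler's own
        // processing time is included in the recorded usage.
        void sendResponseMaybeThrottle(String user, String clientId,
                                       long threadTimeMs, Object response) {
            long delayMs = quotas.recordAndMaybeThrottle(user, clientId, threadTimeMs);
            if (delayMs > 0) {
                sender.sendDelayed(response, delayMs);
            } else {
                sender.send(response);
            }
        }
    }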
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > Regards,
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > Rajini
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
>> > >> > >> > > > > roger.hoover@gmail.com>
>> > >> > >> > > > > > > > wrote:
>> > >> > >> > > > > > > >
>> > >> > >> > > > > > > > > Great to see this KIP and the excellent
>> discussion.
>> > >> > >> > > > > > > > >
>> > >> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
>> > >> application
>> > >> > is
>> > >> > >> > > > > allocated
>> > >> > >> > > > > > 1
>> > >> > >> > > > > > > > > request handler unit, then it's as if I have a
>> > Kafka
>> > >> > >> broker
>> > >> > >> > > with
>> > >> > >> > > > a
>> > >> > >> > > > > > > single
>> > >> > >> > > > > > > > > request handler thread dedicated to me.  That's
>> the
>> > >> > most I
>> > >> > >> > can
>> > >> > >> > > > use,
>> > >> > >> > > > > > at
>> > >> > >> > > > > > > > > least.  That allocation doesn't change even if
>> an
>> > >> admin
>> > >> > >> later
>> > >> > >> > > > > > increases
>> > >> > >> > > > > > > > the
>> > >> > >> > > > > > > > > size of the request thread pool on the broker.
>> > It's
>> > >> > >> similar
>> > >> > >> > to
>> > >> > >> > > > the
>> > >> > >> > > > > > CPU
>> > >> > >> > > > > > > > > abstraction that VMs and containers get from
>> > >> hypervisors
>> > >> > >> or
>> > >> > >> > OS
>> > >> > >> > > > > > > > schedulers.
>> > >> > >> > > > > > > > > While different client access patterns can use
>> > wildly
>> > >> > >> > different
>> > >> > >> > > > > > amounts
>> > >> > >> > > > > > > > of
>> > >> > >> > > > > > > > > request thread resources per request, a given
>> > >> > application
>> > >> > >> > will
>> > >> > >> > > > > > > generally
>> > >> > >> > > > > > > > > have a stable access pattern and can figure out
>> > >> > >> empirically
>> > >> > >> > how
>> > >> > >> > > > > many
>> > >> > >> > > > > > > > > "request thread units" it needs to meet it's
>> > >> > >> > throughput/latency
>> > >> > >> > > > > > goals.
>> > >> > >> > > > > > > > >
>> > >> > >> > > > > > > > > Cheers,
>> > >> > >> > > > > > > > >
>> > >> > >> > > > > > > > > Roger
>> > >> > >> > > > > > > > >
>> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
>> > >> > >> jun@confluent.io>
>> > >> > >> > > > wrote:
>> > >> > >> > > > > > > > >
>> > >> > >> > > > > > > > > > Hi, Rajini,
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > Thanks for the updated KIP. A few more
>> comments.
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > 1. A concern of request_time_percent is that
>> it's
>> > >> not
>> > >> > an
>> > >> > >> > > > absolute
>> > >> > >> > > > > > > > value.
>> > >> > >> > > > > > > > > > Let's say you give a user a 10% limit. If the
>> > admin
>> > >> > >> doubles
>> > >> > >> > > the
>> > >> > >> > > > > > > number
>> > >> > >> > > > > > > > of
>> > >> > >> > > > > > > > > > request handler threads, that user now
>> actually
>> > has
>> > >> > >> twice
>> > >> > >> > the
>> > >> > >> > > > > > > absolute
>> > >> > >> > > > > > > > > > capacity. This may confuse people a bit. So,
>> > >> perhaps
>> > >> > >> > setting
>> > >> > >> > > > the
>> > >> > >> > > > > > > quota
>> > >> > >> > > > > > > > > > based on an absolute request thread unit is
>> > better.
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
>> > >> inter-broker
>> > >> > >> > request
>> > >> > >> > > > and
>> > >> > >> > > > > > > needs
>> > >> > >> > > > > > > > to
>> > >> > >> > > > > > > > > > be excluded from throttling.
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > 3. Implementation wise, I am wondering if it's
>> > >> simpler
>> > >> > >> to
>> > >> > >> > > apply
>> > >> > >> > > > > the
>> > >> > >> > > > > > > > > request
>> > >> > >> > > > > > > > > > time throttling first in KafkaApis.handle().
>> > >> > Otherwise,
>> > >> > >> we
>> > >> > >> > > will
>> > >> > >> > > > > > need
>> > >> > >> > > > > > > to
>> > >> > >> > > > > > > > > add
>> > >> > >> > > > > > > > > > the throttling logic in each type of request.
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > Thanks,
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > Jun
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini
>> Sivaram <
>> > >> > >> > > > > > > > rajinisivaram@gmail.com
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > wrote:
>> > >> > >> > > > > > > > > >
>> > >> > >> > > > > > > > > > > Jun,
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > > Thank you for the review.
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > > I have reverted to the original KIP that
>> > >> throttles
>> > >> > >> based
>> > >> > >> > on
>> > >> > >> > > > > > request
>> > >> > >> > > > > > > > > > handler
>> > >> > >> > > > > > > > > > > utilization. At the moment, it uses
>> percentage,
>> > >> but
>> > >> > I
>> > >> > >> am
>> > >> > >> > > > happy
>> > >> > >> > > > > to
>> > >> > >> > > > > > > > > change
>> > >> > >> > > > > > > > > > to
>> > >> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if
>> > >> required. I
>> > >> > >> have
>> > >> > >> > > > added
>> > >> > >> > > > > > the
>> > >> > >> > > > > > > > > > examples
>> > >> > >> > > > > > > > > > > from this discussion to the KIP. Also added
>> a
>> > >> > "Future
>> > >> > >> > Work"
>> > >> > >> > > > > > section
>> > >> > >> > > > > > > > to
>> > >> > >> > > > > > > > > > > address network thread utilization. The
>> > >> > configuration
>> > >> > >> is
>> > >> > >> > > > named
>> > >> > >> > > > > > > > > > > "request_time_percent" with the expectation
>> > that
>> > >> it
>> > >> > >> can
>> > >> > >> > > also
>> > >> > >> > > > be
>> > >> > >> > > > > > > used
>> > >> > >> > > > > > > > as
>> > >> > >> > > > > > > > > > the
>> > >> > >> > > > > > > > > > > limit for network thread utilization when
>> that
>> > is
>> > >> > >> > > > implemented,
>> > >> > >> > > > > so
>> > >> > >> > > > > > > > that
>> > >> > >> > > > > > > > > > > users have to set only one config for the
>> two
>> > and
>> > >> > not
>> > >> > >> > have
>> > >> > >> > > to
>> > >> > >> > > > > > worry
>> > >> > >> > > > > > > > > about
>> > >> > >> > > > > > > > > > > the internal distribution of the work
>> between
>> > the
>> > >> > two
>> > >> > >> > > thread
>> > >> > >> > > > > > pools
>> > >> > >> > > > > > > in
>> > >> > >> > > > > > > > > > > Kafka.
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > > Regards,
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > > Rajini
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
>> > >> > >> > > jun@confluent.io>
>> > >> > >> > > > > > > wrote:
>> > >> > >> > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > Hi, Rajini,
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > Thanks for the proposal.
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > The benefit of using the request
>> processing
>> > >> time
>> > >> > >> over
>> > >> > >> > the
>> > >> > >> > > > > > request
>> > >> > >> > > > > > > > > rate
>> > >> > >> > > > > > > > > > is
>> > >> > >> > > > > > > > > > > > exactly what people have said. I will just
>> > >> expand
>> > >> > >> that
>> > >> > >> > a
>> > >> > >> > > > bit.
>> > >> > >> > > > > > > > > Consider
>> > >> > >> > > > > > > > > > > the
>> > >> > >> > > > > > > > > > > > following case. The producer sends a
>> produce
>> > >> > request
>> > >> > >> > > with a
>> > >> > >> > > > > > 10MB
>> > >> > >> > > > > > > > > > message
>> > >> > >> > > > > > > > > > > > but compressed to 100KB with gzip. The
>> > >> > >> decompression of
>> > >> > >> > > the
>> > >> > >> > > > > > > message
>> > >> > >> > > > > > > > > on
>> > >> > >> > > > > > > > > > > the
>> > >> > >> > > > > > > > > > > > broker could take 10-15 seconds, during
>> which
>> > >> > time,
>> > >> > >> a
>> > >> > >> > > > request
>> > >> > >> > > > > > > > handler
>> > >> > >> > > > > > > > > > > > thread is completely blocked. In this
>> case,
>> > >> > neither
>> > >> > >> the
>> > >> > >> > > > > byte-in
>> > >> > >> > > > > > > > quota
>> > >> > >> > > > > > > > > > nor
>> > >> > >> > > > > > > > > > > > the request rate quota may be effective in
>> > >> > >> protecting
>> > >> > >> > the
>> > >> > >> > > > > > broker.
>> > >> > >> > > > > > > > > > > Consider
>> > >> > >> > > > > > > > > > > > another case. A consumer group starts
>> with 10
>> > >> > >> instances
>> > >> > >> > > and
>> > >> > >> > > > > > later
>> > >> > >> > > > > > > > on
>> > >> > >> > > > > > > > > > > > switches to 20 instances. The request rate
>> > will
>> > >> > >> likely
>> > >> > >> > > > > double,
>> > >> > >> > > > > > > but
>> > >> > >> > > > > > > > > the
>> > >> > >> > > > > > > > > > > > actual load on the broker may not double
>> > >> since
>> > >> > >> each
>> > >> > >> > > fetch
>> > >> > >> > > > > > > request
>> > >> > >> > > > > > > > > > only
>> > >> > >> > > > > > > > > > > > contains half of the partitions. Request
>> rate
>> > >> > quota
>> > >> > >> may
>> > >> > >> > > not
>> > >> > >> > > > > be
>> > >> > >> > > > > > > easy
>> > >> > >> > > > > > > > > to
>> > >> > >> > > > > > > > > > > > configure in this case.
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > What we really want is to be able to
>> prevent
>> > a
>> > >> > >> client
>> > >> > >> > > from
>> > >> > >> > > > > > using
>> > >> > >> > > > > > > > too
>> > >> > >> > > > > > > > > > much
>> > >> > >> > > > > > > > > > > > of the server side resources. In this
>> > >> particular
>> > >> > >> KIP,
>> > >> > >> > > this
>> > >> > >> > > > > > > resource
>> > >> > >> > > > > > > > > is
>> > >> > >> > > > > > > > > > > the
>> > >> > >> > > > > > > > > > > > capacity of the request handler threads. I
>> > >> agree
>> > >> > >> that
>> > >> > >> > it
>> > >> > >> > > > may
>> > >> > >> > > > > > not
>> > >> > >> > > > > > > be
>> > >> > >> > > > > > > > > > > > intuitive for the users to determine how
>> to
>> > set
>> > >> > the
>> > >> > >> > right
>> > >> > >> > > > > > limit.
>> > >> > >> > > > > > > > > > However,
>> > >> > >> > > > > > > > > > > > this is not completely new and has been
>> done
>> > in
>> > >> > the
>> > >> > >> > > > container
>> > >> > >> > > > > > > world
>> > >> > >> > > > > > > > > > > > already. For example, Linux cgroup (
>> > >> > >> > > > > https://access.redhat.com/
>> > >> > >> > > > > > > > > > > > documentation/en-US/Red_Hat_En
>> > >> > >> terprise_Linux/6/html/
>> > >> > >> > > > > > > > > > > > Resource_Management_Guide/sec-cpu.html)
>> has
>> > >> the
>> > >> > >> > concept
>> > >> > >> > > of
>> > >> > >> > > > > > > > > > > > cpu.cfs_quota_us,
>> > >> > >> > > > > > > > > > > > which specifies the total amount of time
>> in
>> > >> > >> > microseconds
>> > >> > >> > > > for
>> > >> > >> > > > > > > which
>> > >> > >> > > > > > > > > all
>> > >> > >> > > > > > > > > > > > tasks in a cgroup can run during a one
>> second
>> > >> > >> period.
>> > >> > >> > We
>> > >> > >> > > > can
>> > >> > >> > > > > > > > > > potentially
>> > >> > >> > > > > > > > > > > > model the request handler threads in a
>> > similar
>> > >> > way.
>> > >> > >> For
>> > >> > >> > > > > > example,
>> > >> > >> > > > > > > > each
>> > >> > >> > > > > > > > > > > > request handler thread can be 1 request
>> > handler
>> > >> > unit
>> > >> > >> > and
>> > >> > >> > > > the
>> > >> > >> > > > > > > admin
>> > >> > >> > > > > > > > > can
>> > >> > >> > > > > > > > > > > > configure a limit on how many units (say
>> > 0.01)
>> > >> a
>> > >> > >> client
>> > >> > >> > > can
>> > >> > >> > > > > > have.
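A worked example of that unit model (numbers mine): with num.io.threads=8, a
client holding 0.01 units may use 10 ms of handler thread time per second,
i.e. 0.125% of this broker's total handler capacity, and that absolute budget
stays the same if the thread pool later grows.

    public class HandlerUnitExample {
        public static void main(String[] args) {
            int numIoThreads = 8;      // num.io.threads on the broker
            double clientUnits = 0.01; // quota: 1/100th of one handler thread

            // Absolute budget: 0.01 thread-seconds per second = 10 ms/s of
            // handler thread time, independent of the pool size.
            double threadTimeMsPerSec = clientUnits * 1000;

            // Relative share of this broker's total handler capacity.
            double shareOfCapacity = clientUnits / numIoThreads; // 0.00125

            System.out.printf("budget=%.1f ms/s, share=%.3f%%%n",
                    threadTimeMsPerSec, shareOfCapacity * 100);
        }
    }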
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > Regarding not throttling the internal
>> broker
>> > to
>> > >> > >> broker
>> > >> > >> > > > > > requests.
>> > >> > >> > > > > > > We
>> > >> > >> > > > > > > > > > could
>> > >> > >> > > > > > > > > > > > do that. Alternatively, we could just let
>> the
>> > >> > admin
>> > >> > >> > > > > configure a
>> > >> > >> > > > > > > > high
>> > >> > >> > > > > > > > > > > limit
>> > >> > >> > > > > > > > > > > > for the kafka user (it may not be able to
>> do
>> > >> that
>> > >> > >> > easily
>> > >> > >> > > > > based
>> > >> > >> > > > > > on
>> > >> > >> > > > > > > > > > > clientId
>> > >> > >> > > > > > > > > > > > though).
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > Ideally we want to be able to protect the
>> > >> > >> utilization
>> > >> > >> > of
>> > >> > >> > > > the
>> > >> > >> > > > > > > > network
>> > >> > >> > > > > > > > > > > thread
>> > >> > >> > > > > > > > > > > > pool too. The difficulty is mostly what
>> Rajini
>> > >> > said:
>> > >> > >> (1)
>> > >> > >> > > The
>> > >> > >> > > > > > > > mechanism
>> > >> > >> > > > > > > > > > for
>> > >> > >> > > > > > > > > > > > throttling the requests is through
>> Purgatory
>> > >> and
>> > >> > we
>> > >> > >> > will
>> > >> > >> > > > have
>> > >> > >> > > > > > to
>> > >> > >> > > > > > > > > think
>> > >> > >> > > > > > > > > > > > through how to integrate that into the
>> > network
>> > >> > >> layer.
>> > >> > >> > > (2)
>> > >> > >> > > > In
>> > >> > >> > > > > > the
>> > >> > >> > > > > > > > > > network
>> > >> > >> > > > > > > > > > > > layer, currently we know the user, but not
>> > the
>> > >> > >> clientId
>> > >> > >> > > of
>> > >> > >> > > > > the
>> > >> > >> > > > > > > > > request.
>> > >> > >> > > > > > > > > > > So,
>> > >> > >> > > > > > > > > > > > it's a bit tricky to throttle based on
>> > clientId
>> > >> > >> there.
>> > >> > >> > > > Plus,
>> > >> > >> > > > > > the
>> > >> > >> > > > > > > > > > byteOut
>> > >> > >> > > > > > > > > > > > quota can already protect the network
>> thread
>> > >> > >> > utilization
>> > >> > >> > > > for
>> > >> > >> > > > > > > fetch
>> > >> > >> > > > > > > > > > > > requests. So, if we can't figure out this
>> > part
>> > >> > right
>> > >> > >> > now,
>> > >> > >> > > > > just
>> > >> > >> > > > > > > > > focusing
>> > >> > >> > > > > > > > > > > on
>> > >> > >> > > > > > > > > > > > the request handling threads for this KIP
>> is
>> > >> > still a
>> > >> > >> > > useful
>> > >> > >> > > > > > > > feature.
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > Thanks,
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > Jun
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
>> > >> > >> > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > Thank you all for the feedback.
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > Jay: I have removed the exemption for consumer heartbeat etc. Agree that protecting the cluster is more important than protecting individual apps. Have retained the exemption for StopReplica/LeaderAndIsr etc; these are throttled only if authorization fails (so they can't be used for DoS attacks in a secure cluster, but inter-broker requests complete without delays).
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > I will wait another day to see if there is any objection to quotas based on request processing time (as opposed to request rate) and, if there are no objections, I will revert to the original proposal with some changes.
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > The original proposal was only including the time used by the request handler threads (that made calculation easy). I think the suggestion is to include the time spent in the network threads as well since that may be significant. As Jay pointed out, it is more complicated to calculate the total available CPU time and convert to a ratio when there are *m* I/O threads and *n* network threads. ThreadMXBean#getThreadCPUTime() may give us what we want, but it can be very expensive on some platforms. As Becket and Guozhang have pointed out, we do have several time measurements already for generating metrics that we could use, though we might want to switch to nanoTime() instead of currentTimeMillis() since some of the values for small requests may be < 1ms. But rather than add up the time spent in the I/O thread and the network thread, wouldn't it be better to convert the time spent on each thread into a separate ratio? UserA has a request quota of 5%. Can we take that to mean that UserA can use 5% of the time on network threads and 5% of the time on I/O threads? If either is exceeded, the response is throttled - it would mean maintaining two sets of metrics for the two durations, but would result in more meaningful ratios. We could define two quota limits (UserA has 5% of request threads and 10% of network threads), but that seems unnecessary and harder to explain to users.
>> > >> > >> > > > > > > > > > > > >
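
As an aside, the ThreadMXBean approach mentioned above can be sketched in a few lines. This is illustrative only: the thread-name prefixes below are the conventional broker thread names and may vary by version, and, as noted in the message, getThreadCpuTime() can be expensive on some platforms.

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadInfo;
    import java.lang.management.ThreadMXBean;

    public class ThreadPoolCpuTime {
        // Sum CPU time (in nanos) across live threads whose names start with
        // the given prefix, e.g. the request handler or network thread pools.
        static long totalCpuNanos(ThreadMXBean bean, String namePrefix) {
            long total = 0;
            for (long id : bean.getAllThreadIds()) {
                ThreadInfo info = bean.getThreadInfo(id);
                long cpu = bean.getThreadCpuTime(id); // -1 if unsupported or thread dead
                if (info != null && cpu > 0 && info.getThreadName().startsWith(namePrefix))
                    total += cpu;
            }
            return total;
        }

        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            // Assumed thread-name prefixes; check the actual broker thread names.
            long ioNanos = totalCpuNanos(bean, "kafka-request-handler-");
            long netNanos = totalCpuNanos(bean, "kafka-network-thread-");
            System.out.printf("io=%d ns, network=%d ns%n", ioNanos, netNanos);
        }
    }
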
>> > >> > >> > > > > > > > > > > > > Back to why and how quotas are applied to network thread utilization:
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > a) In the case of fetch, the time spent in the network thread may be significant and I can see the need to include this. Are there other requests where the network thread utilization is significant? In the case of fetch, request handler thread utilization would throttle clients with a high request rate and low data volume, and the fetch byte rate quota would throttle clients with high data volume. Network thread utilization is perhaps proportional to the data volume. I am wondering if we even need to throttle based on network thread utilization or whether the data volume quota covers this case.
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > b) At the moment, we record and check for quota violation at the same time. If a quota is violated, the response is delayed. Using Jay's example of disk reads for fetches happening in the network thread, we can't record and delay a response after the disk reads. We could record the time spent on the network thread when the response is complete and introduce a delay for handling a subsequent request (separating out recording and quota violation handling in the case of network thread overload). Does that make sense?
>> > >> > >> > > > > > > > > > > > >
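The "record now, throttle later" idea in point (b) could look roughly like the sketch below: network-thread time is recorded when a response completes, and any overage is charged as a delay against the same client's next request. All names here are illustrative, not from the KIP.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;

    // Sketch: decouple recording from enforcement for network-thread time.
    public class NetworkThreadQuota {
        private final Map<String, Long> pendingDelayMs = new ConcurrentHashMap<>();

        // Called by the network thread when a response finishes.
        public void onResponseComplete(String clientId, long networkTimeMs) {
            long delayMs = recordAndComputeOverageMs(clientId, networkTimeMs);
            if (delayMs > 0)
                pendingDelayMs.merge(clientId, delayMs, Long::sum);
        }

        // Called before the client's next request is processed.
        public long throttleNextRequestMs(String clientId) {
            Long delay = pendingDelayMs.remove(clientId);
            return delay == null ? 0L : delay;
        }

        private long recordAndComputeOverageMs(String clientId, long timeMs) {
            // Record into the client's utilization sensor and return how far
            // over quota it is; elided here for brevity.
            return 0L;
        }
    }
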
>> > >> > >> > > > > > > > > > > > > Regards,
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > Rajini
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:
>> > >> > >> > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > Hey Jay,
>> > >> > >> > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time is a little tricky. I am thinking that maybe we can use the existing request statistics. They are already very detailed, so we can probably see the approximate CPU time from them, e.g. something like (total_time - request/response_queue_time - remote_time).
>> > >> > >> > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user is throttled it is likely that we need to see if anything has gone wrong first, and if the users are well behaved and just need more resources, we will have to bump up the quota for them. It is true that pre-allocating CPU time quota precisely for the users is difficult. So in practice it would probably be more like first setting a relatively high protective CPU time quota for everyone and increasing that for some individual clients on demand.
>> > >> > >> > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > Thanks,
>> > >> > >> > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
>> > >> > >> > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
>> > >> > >> > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see it happening.
>> > >> > >> > > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or more specifically processing time ratio, instead of the request rate throttling as well. Becket has summed up my rationales very well above, and one thing to add here is that the former has good support both for "protecting against rogue clients" and for "utilizing a cluster for multi-tenancy usage": when thinking about how to explain this to the end users, I find it actually more natural than the request rate since, as mentioned above, different requests will have quite different "cost", and Kafka today already has various request types (produce, fetch, admin, metadata, etc); because of that, request rate throttling may not be as effective unless it is set very conservatively.
>> > >> > >> > > > > > > > > > > > > > >
>> > >> > >> > > > > > > > > > > > > > > Regarding user reactions when they are throttled, I think it may differ case-by-case, and needs to be discovered / guided by looking at the relevant metrics. So in other words users would not expect to get additional information by simply being told "hey, you are throttled", which is all that throttling does; they need to take a follow-up step and see "hmm, I'm throttled probably because of ..", which they do by looking at other metric values: e.g. whether I'm bombarding the brokers with ...
>
> [Message clipped]

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Jay,

2. Regarding request.unit vs request.percentage. I started with
request.percentage too. The reasoning for request.unit is the following.
Suppose that the capacity has been reached on a broker and the admin needs
to add a new user. A simple way to increase the capacity is to increase the
number of io threads, assuming there are still enough cores. If the limit
is based on percentage, the additional capacity automatically gets
distributed to existing users and we haven't really carved out any
additional resource for the new user. Now, is it easy for a user to reason
about 0.1 units vs 10%? My feeling is that both are hard and have to be
configured empirically. Not sure if percentage is obviously easier to
reason about.
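
To make the trade-off concrete, here is a minimal sketch of the capacity each scheme implies when the pool is resized (the numbers are only an example):

    // Sketch: absolute capacity implied by each quota scheme when the admin
    // resizes the I/O thread pool. Illustrative only.
    public class QuotaUnitsVsPercent {
        public static void main(String[] args) {
            int ioThreadsBefore = 8, ioThreadsAfter = 16; // num.io.threads doubled

            double percentQuota = 10.0; // 10% of total request handling time
            System.out.println(percentQuota / 100 * ioThreadsBefore); // 0.8 thread-units
            System.out.println(percentQuota / 100 * ioThreadsAfter);  // 1.6 thread-units:
            // the existing user silently absorbs half of the added capacity.

            double unitQuota = 0.8; // absolute thread-units
            System.out.println(unitQuota); // still 0.8: the extra threads remain
            // free to be carved out for the new user.
        }
    }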

Thanks,

Jun

On Fri, Feb 24, 2017 at 8:10 AM, Jay Kreps <ja...@confluent.io> wrote:

> A couple of quick points:
>
> 1. Even though the implementation of this quota is only using io thread
> time, i think we should call it something like "request-time". This will
> give us flexibility to improve the implementation to cover network threads
> in the future and will avoid exposing internal details like our thread
> pools on the server.
>
> 2. Jun/Roger, I get what you are trying to fix but the idea of thread/units
> is super unintuitive as a user-facing knob. I had to read the KIP like
> eight times to understand this. I'm not sure about your point that
> increasing the number of threads is a problem with a percentage-based
> value; it really depends on whether the user thinks about the "percentage
> of request processing time" or "thread units". If they think "I have
> allocated 10% of my request processing time to user x" then it is a bug
> that increasing the thread count decreases that percent as it does in the
> current proposal. As a practical matter I think the only way to actually
> reason about this is as a percent---I just don't believe people are going
> to think, "ah, 4.3 thread units, that is the right amount!". Instead I
> think they have to understand this thread unit concept, figure out what
> they have set in number of threads, compute a percent and then come up with
> the number of thread units, and these will all be wrong if that thread
> count changes. I also think this ties us to throttling the I/O thread pool,
> which may not be where we want to end up.
>
> 3. For what it's worth I do think having a single throttle_ms field in all
> the responses that combines all throttling from all quotas is probably the
> simplest. There could be a use case for having separate fields for each,
> but I think that is actually harder to use/monitor in the common case so
> unless someone has a use case I think just one should be fine.
>
> -Jay
>
> On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <ra...@gmail.com> wrote:
>
> > I have updated the KIP based on the discussions so far.
> >
> >
> > Regards,
> >
> > Rajini
> >
> > On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> >
> > > Thank you all for the feedback.
> > >
> > > Ismael #1. It makes sense not to throttle inter-broker requests like
> > > LeaderAndIsr etc. The simplest way to ensure that clients cannot use these
> > > requests to bypass quotas for DoS attacks is to ensure that ACLs prevent
> > > clients from using these requests and that unauthorized requests are
> > > included towards quotas.
> > >
> > > Ismael #2, Jay #1: I was thinking that these quotas can return a separate
> > > throttle time, and all utilization based quotas could use the same field
> > > (we won't add another one for network thread utilization, for instance).
> > > But perhaps it makes sense to keep byte rate quotas separate in
> > > produce/fetch responses to provide separate metrics? Agree with Ismael
> > > that the name of the existing field should be changed if we have two.
> > > Happy to switch to a single combined throttle time if that is sufficient.
> > >
> > > Ismael #4, #5, #6: Will update the KIP. Will use a dot separated name for
> > > the new property. Replication quotas use dot separated names, so it will
> > > be consistent with all properties except byte rate quotas.
> > >
> > > Radai: #1 Request processing time rather than request rate was chosen
> > > because the time per request can vary significantly between requests, as
> > > mentioned in the discussion and the KIP.
> > > #2 Two separate quotas for heartbeats/regular requests feel like more
> > > configuration and more metrics. Since most users would set quotas higher
> > > than the expected usage and quotas are more of a safety net, a single
> > > quota should work in most cases.
> > > #3 The number of requests in purgatory is limited by the number of active
> > > connections since only one request per connection will be throttled at a
> > > time.
> > > #4 As with byte rate quotas, to use the full allocated quota,
> > > clients/users would need to use partitions that are distributed across
> > > the cluster. The alternative of using cluster-wide quotas instead of
> > > per-broker quotas would be far too complex to implement.
> > >
> > > Dong: We currently have two ClientQuotaManagers, for the quota types
> > > Fetch and Produce. A new one will be added for IOThread, which manages
> > > quotas for I/O thread utilization. This will not update the Fetch or
> > > Produce queue-size, but will have a separate metric for the queue-size.
> > > I wasn't planning to add any additional metrics apart from the equivalent
> > > ones for existing quotas as part of this KIP. Ratio of byte-rate to I/O
> > > thread utilization could be slightly misleading since it depends on the
> > > sequence of requests. But we can look into more metrics after the KIP is
> > > implemented if required.
> > >
> > > I think we need to limit the maximum delay since all requests are
> > > throttled. If a client has a quota of 0.001 units and a single request
> > > used 50ms, we don't want to delay all requests from the client by 50
> > > seconds, throwing the client out of all its consumer groups. The issue
> > > arises only if a user is allocated a quota that is insufficient to
> > > process one large request. The expectation is that the units allocated
> > > per user will be much higher than the time taken to process one request,
> > > and the limit should seldom be applied. Agree this needs proper
> > > documentation.
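
The arithmetic behind the 0.001-unit example, with the cap applied, is roughly as follows; the exact formula used by the broker may differ, this just illustrates the orders of magnitude:

    // Sketch of the delay computation with the window-size cap. A quota of
    // 0.001 units means the client may use 0.1% of one thread's time, so a
    // single 50 ms request would naively earn ~50 s of delay.
    public class ThrottleTimeCap {
        static long throttleMs(double usedMs, double quotaUnits, long windowMs) {
            double uncapped = usedMs / quotaUnits - windowMs; // amortize down to quota
            return (long) Math.min(Math.max(uncapped, 0), windowMs);
        }

        public static void main(String[] args) {
            // Uncapped: 50 / 0.001 - 1000 = 49,000 ms; capped to the 1,000 ms
            // window, which is why the client can temporarily exceed its quota
            // (the concern Dong raises below).
            System.out.println(throttleMs(50, 0.001, 1000)); // prints 1000
        }
    }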
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com> wrote:
> > >
> > >> @jun: i wasn't concerned about tying up a request processing thread, but
> > >> IIUC the code does still read the entire request out, which might add up
> > >> to a non-negligible amount of memory.
> > >>
> > >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com> wrote:
> > >>
> > >> > Hey Rajini,
> > >> >
> > >> > The current KIP says that the maximum delay will be reduced to the
> > >> > window size if it is larger than the window size. I have a concern with
> > >> > this:
> > >> >
> > >> > 1) This essentially means that the user is allowed to exceed their
> > >> > quota over a long period of time. Can you provide an upper bound on
> > >> > this deviation?
> > >> >
> > >> > 2) What is the motivation for capping the maximum delay by the window
> > >> > size? I am wondering if there is a better alternative to address the
> > >> > problem.
> > >> >
> > >> > 3) It means that the existing metric-related config will have a more
> > >> > direct impact on the mechanism of this io-thread-unit-based quota. This
> > >> > may be an important change depending on the answer to 1) above. We
> > >> > probably need to document this more explicitly.
> > >> > Dong
> > >> >
> > >> >
> > >> > > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com> wrote:
> > >> >
> > >> > > Hey Jun,
> > >> > >
> > >> > > Yeah you are right. I thought it wasn't because at LinkedIn it would
> > >> > > be too much pressure on inGraph to expose those per-clientId metrics,
> > >> > > so we ended up printing them periodically to a local log. Never mind
> > >> > > if it is not a general problem.
> > >> > >
> > >> > > Hey Rajini,
> > >> > >
> > >> > > - I agree with Jay that we probably don't want to add a new field for
> > >> > > every quota in ProduceResponse or FetchResponse. Is there any use-case
> > >> > > for having separate throttle-time fields for byte-rate-quota and
> > >> > > io-thread-unit-quota? You probably need to document this as an
> > >> > > interface change if you plan to add a new field to any request.
> > >> > >
> > >> > > - I don't think IOThread belongs in quotaType. The existing quota
> > >> > > types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> > >> > > identify the type of request that is throttled, not the quota
> > >> > > mechanism that is applied.
> > >> > >
> > >> > > - If a request is throttled due to this io-thread-unit-based quota, is
> > >> > > the existing queue-size metric in ClientQuotaManager incremented?
> > >> > >
> > >> > > - In the interest of providing a guideline for admins to decide on
> > >> > > io-thread-unit-based quotas and for users to understand the impact on
> > >> > > their traffic, would it be useful to have a metric that shows the
> > >> > > overall byte-rate per io-thread-unit? Can we also show this as a
> > >> > > per-clientId metric?
> > >> > >
> > >> > > Thanks,
> > >> > > Dong
> > >> > >
> > >> > >
> > >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:
> > >> > >
> > >> > >> Hi, Ismael,
> > >> > >>
> > >> > >> For #3, typically an admin won't configure more io threads than CPU
> > >> > >> cores, but it's possible for an admin to start with fewer io threads
> > >> > >> than cores and grow that later on.
> > >> > >>
> > >> > >> Hi, Dong,
> > >> > >>
> > >> > >> I think the throttleTime sensor on the broker tells the admin whether
> > >> > >> a user/clientId is throttled or not.
> > >> > >>
> > >> > >> Hi, Radai,
> > >> > >>
> > >> > >> The reasoning for delaying the throttled requests on the broker
> > >> > >> instead of returning an error immediately is that the latter has no
> > >> > >> way to prevent the client from retrying immediately, which would make
> > >> > >> things worse. The delaying logic is based off a delay queue. A
> > >> > >> separate expiration thread just waits on the next request to expire.
> > >> > >> So, it doesn't tie up a request handler thread.
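
Jun's description maps naturally onto java.util.concurrent.DelayQueue. A stripped-down sketch of the pattern (not the broker's actual code):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // A throttled response waits in a DelayQueue; a single expiration thread
    // sends responses as their delays elapse, so no request handler thread is
    // tied up while a response is being delayed.
    class ThrottledResponse implements Delayed {
        private final long sendAtNanos;
        final Runnable send;

        ThrottledResponse(long delayMs, Runnable send) {
            this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
            this.send = send;
        }

        @Override
        public long getDelay(TimeUnit unit) {
            return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override
        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    class ExpirationThread extends Thread {
        private final DelayQueue<ThrottledResponse> queue;

        ExpirationThread(DelayQueue<ThrottledResponse> queue) {
            this.queue = queue;
            setDaemon(true);
        }

        @Override
        public void run() {
            try {
                while (!isInterrupted())
                    queue.take().send.run(); // blocks until the next delay expires
            } catch (InterruptedException e) {
                // shutting down
            }
        }
    }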
> > >> > >>
> > >> > >> Thanks,
> > >> > >>
> > >> > >> Jun
> > >> > >>
> > >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk> wrote:
> > >> > >>
> > >> > >> > Hi Jay,
> > >> > >> >
> > >> > >> > Regarding 1, I definitely like the simplicity of keeping a single
> > >> > >> > throttle time field in the response. The downside is that the
> > >> > >> > client metrics will be more coarse grained.
> > >> > >> >
> > >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> > >> > >> > `log.cleaner.min.cleanable.ratio`.
> > >> > >> >
> > >> > >> > Ismael
> > >> > >> >
> > >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io> wrote:
> > >> > >> >
> > >> > >> > > A few minor comments:
> > >> > >> > >
> > >> > >> > >    1. Isn't it the case that the throttling time response field should have the total time your request was throttled, irrespective of the quotas that caused it? Limiting it to the byte rate quota doesn't make sense, but I also don't think we want to end up adding new fields in the response for every single thing we quota, right?
> > >> > >> > >    2. I don't think we should make this quota specifically about io threads. Once we introduce these quotas people set them and expect them to be enforced (and if they aren't it may cause an outage). As a result they are a bit more sensitive than normal configs, I think. The current thread pools seem like something of an implementation detail and not the level the user-facing quotas should be involved with. I think it might be better to make this a general request-time throttle with no mention in the naming about I/O threads and simply acknowledge the current limitation (which we may someday fix) in the docs that this covers only the time after the request is read off the network.
> > >> > >> > >    3. As such I think the right interface to the user would be something like percent_request_time in {0,...,100} or request_time_ratio in {0.0,...,1.0} (I think "ratio" is the terminology we used if the scale is between 0 and 1 in the other metrics, right?)
> > >> > >> > >
> > >> > >> > > -Jay
> > >> > >> > >
> > >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > >> > >> > >
> > >> > >> > > > Guozhang/Dong,
> > >> > >> > > >
> > >> > >> > > > Thank you for the feedback.
> > >> > >> > > >
> > >> > >> > > > Guozhang: I have updated the section on co-existence of byte rate and request time quotas.
> > >> > >> > > >
> > >> > >> > > > Dong: I hadn't added much detail to the metrics and sensors since they are going to be very similar to the existing metrics and sensors. To avoid confusion, I have now added more detail. All metrics are in the group "quotaType" and all sensors have names starting with "quotaType" (where quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*). So there will be no reuse of existing metrics/sensors. The new ones for request processing time based throttling will be completely independent of the existing metrics/sensors, but will be consistent in format.
> > >> > >> > > >
> > >> > >> > > > The existing throttle_time_ms field in produce/fetch responses will not be impacted by this KIP. That will continue to return byte-rate based throttling times. In addition, a new field request_throttle_time_ms will be added to return request quota based throttling times. These will be exposed as new metrics on the client side.
> > >> > >> > > >
> > >> > >> > > > Since all metrics and sensors are different for each type of quota, I believe there are already sufficient metrics to monitor throttling on both the client and broker side for each type of throttling.
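
For readers unfamiliar with the quota plumbing, the common metrics library already supports bounded sensors, so a per-user IOThread sensor could be registered along these lines. The names and units are illustrative, not the final ones from the KIP:

    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.common.metrics.MetricConfig;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Quota;
    import org.apache.kafka.common.metrics.QuotaViolationException;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Rate;

    public class IoThreadSensorSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            // Allow 50 ms of I/O thread time per second, i.e. 5% of one thread.
            Sensor sensor = metrics.sensor("IOThread-userA",
                new MetricConfig().quota(Quota.upperBound(50.0)));
            sensor.add(metrics.metricName("io-thread-time-rate", "IOThread",
                "I/O thread time (ms) used per second"), new Rate(TimeUnit.SECONDS));
            try {
                sensor.record(30.0); // record 30 ms of handler time for a request
            } catch (QuotaViolationException e) {
                // over quota: compute the throttle time and delay the response
            }
        }
    }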
> > >> > >> > > >
> > >> > >> > > > Regards,
> > >> > >> > > >
> > >> > >> > > > Rajini
> > >> > >> > > >
> > >> > >> > > >
> > >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com> wrote:
> > >> > >> > > >
> > >> > >> > > > > Hey Rajini,
> > >> > >> > > > >
> > >> > >> > > > > I think it makes a lot of sense to use io_thread_units as the metric to quota users' traffic here. LGTM overall. I have some questions regarding sensors.
> > >> > >> > > > >
> > >> > >> > > > > - Can you be more specific in the KIP about which sensors will be added? For example, it would be useful to specify the name and attributes of these new sensors.
> > >> > >> > > > >
> > >> > >> > > > > - We currently have throttle-time and queue-size for the byte-rate based quota. Are you going to have a separate throttle-time and queue-size for requests throttled by the io_thread_unit-based quota, or will they share the same sensor?
> > >> > >> > > > >
> > >> > >> > > > > - Does the throttle-time in the ProduceResponse and FetchResponse contain time due to the io_thread_unit-based quota?
> > >> > >> > > > >
> > >> > >> > > > > - Currently the Kafka server doesn't provide any log or metrics that tell whether any given clientId (or user) is throttled. This is not too bad because we can still check the client-side byte-rate metric to validate whether a given client is throttled. But with this io_thread_unit there will be no way to validate whether a given client is slow because it has exceeded its io_thread_unit limit. It is necessary for users to be able to know this information to figure out whether they have reached their quota limit. How about we add a log4j log on the server side to periodically print (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time) so that the Kafka administrator can identify the users that have reached their limit and act accordingly?
> > >> > >> > > > >
> > >> > >> > > > > Thanks,
> > >> > >> > > > > Dong
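
Dong's last suggestion could be as simple as a scheduled task over per-client throttle times. A sketch (the map stands in for whatever the quota managers would expose; a real implementation would use the log4j logger rather than stdout):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    // Sketch: periodically print per-client throttle times so an admin can
    // spot clients that are hitting their quotas. Illustrative only.
    public class ThrottleReporter {
        private final Map<String, double[]> throttleMs = new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

        public void start() {
            scheduler.scheduleAtFixedRate(() ->
                throttleMs.forEach((clientId, t) -> {
                    if (t[0] > 0 || t[1] > 0)
                        System.out.printf(
                            "client_id=%s byte-rate-throttle-ms=%.1f io-thread-unit-throttle-ms=%.1f%n",
                            clientId, t[0], t[1]);
                }), 1, 1, TimeUnit.MINUTES);
        }

        // Fed by the quota managers whenever a response is throttled.
        public void record(String clientId, double byteRateMs, double ioThreadMs) {
            throttleMs.put(clientId, new double[] { byteRateMs, ioThreadMs });
        }
    }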
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > >
> > >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > >> > >> > > > >
> > >> > >> > > > > > Made a pass over the doc, overall LGTM except a minor comment on the throttling implementation:
> > >> > >> > > > > >
> > >> > >> > > > > > Stated as "Request processing time throttling will be applied on top if necessary." I thought that it meant the request processing time throttling is applied first, but continuing to read I found it actually meant to apply produce / fetch byte rate throttling first.
> > >> > >> > > > > >
> > >> > >> > > > > > Also the last sentence, "The remaining delay if any is applied to the response.", is a bit confusing to me. Maybe reword it a bit?
> > >> > >> > > > > >
> > >> > >> > > > > > Guozhang
> > >> > >> > > > > >
> > >> > >> > > > > >
> > >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io> wrote:
> > >> > >> > > > > >
> > >> > >> > > > > > > Hi, Rajini,
> > >> > >> > > > > > >
> > >> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks good to me.
> > >> > >> > > > > > >
> > >> > >> > > > > > > Jun
> > >> > >> > > > > > >
> > >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > >> > >> > > > > > >
> > >> > >> > > > > > > > Jun/Roger,
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > Thank you for the feedback.
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > 1. I have updated the KIP to use absolute units instead of percentage. The property is called *io_thread_units* to align with the thread count property *num.io.threads*. When we implement network thread utilization quotas, we can add another property *network_thread_units*.
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > 2. ControlledShutdown is already listed under the exempt requests. Jun, did you mean a different request that needs to be added? The four requests currently exempt in the KIP are StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata. These are controlled using the ClusterAction ACL, so it is easy to exclude them and only throttle if unauthorized. I wasn't sure if there are other requests used only for inter-broker communication that needed to be excluded.
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > 3. I was thinking the smallest change would be to replace all references to *requestChannel.sendResponse()* with a local method *sendResponseMaybeThrottle()* that does the throttling, if any, plus sends the response. If we throttle first in *KafkaApis.handle()*, the time spent within the method handling the request will not be recorded or used in throttling. We can look into this again when the PR is ready for review.
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > Regards,
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > Rajini
> > >> > >> > > > > > > >
> > >> > >> > > > > > > >
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:
> > >> > >> > > > > > > >
> > >> > >> > > > > > > > > Great to see this KIP and the excellent discussion.
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > To me, Jun's suggestion makes sense. If my application is allocated 1 request handler unit, then it's as if I have a Kafka broker with a single request handler thread dedicated to me. That's the most I can use, at least. That allocation doesn't change even if an admin later increases the size of the request thread pool on the broker. It's similar to the CPU abstraction that VMs and containers get from hypervisors or OS schedulers. While different client access patterns can use wildly different amounts of request thread resources per request, a given application will generally have a stable access pattern and can figure out empirically how many "request thread units" it needs to meet its throughput/latency goals.
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > Cheers,
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > Roger
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > > Hi, Rajini,
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > Thanks for the updated KIP. A few more comments.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > 1. A concern with request_time_percent is that it's not an absolute value. Let's say you give a user a 10% limit. If the admin doubles the number of request handler threads, that user now actually has twice the absolute capacity. This may confuse people a bit. So, perhaps setting the quota based on an absolute request thread unit is better.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an inter-broker request and needs to be excluded from throttling.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > 3. Implementation wise, I am wondering if it's simpler to apply the request time throttling first in KafkaApis.handle(). Otherwise, we will need to add the throttling logic in each type of request.
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > Thanks,
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > Jun
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > >> > >> > > > > > > > > >
> > >> > >> > > > > > > > > > > Jun,
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > Thank you for the review.
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > I have reverted to the original KIP that throttles based on request handler utilization. At the moment, it uses percentage, but I am happy to change to a fraction (out of 1 instead of 100) if required. I have added the examples from this discussion to the KIP. Also added a "Future Work" section to address network thread utilization. The configuration is named "request_time_percent" with the expectation that it can also be used as the limit for network thread utilization when that is implemented, so that users have to set only one config for the two and not have to worry about the internal distribution of the work between the two thread pools in Kafka.
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > Regards,
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > Rajini
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Hi, Rajini,
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Thanks for the proposal.
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > The benefit of using the request processing time over the request rate is exactly what people have said. I will just expand on that a bit. Consider the following case. The producer sends a produce request with a 10MB message but compressed to 100KB with gzip. The decompression of the message on the broker could take 10-15 seconds, during which time a request handler thread is completely blocked. In this case, neither the byte-in quota nor the request rate quota may be effective in protecting the broker. Consider another case. A consumer group starts with 10 instances and later on switches to 20 instances. The request rate will likely double, but the actual load on the broker may not double since each fetch request only contains half of the partitions. Request rate quotas may not be easy to configure in this case.
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > What we really want is to be able to prevent a client from using too much of the server side resources. In this particular KIP, this resource is the capacity of the request handler threads. I agree that it may not be intuitive for the users to determine how to set the right limit. However, this is not completely new and has been done in the container world already. For example, Linux cgroup (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html) has the concept of cpu.cfs_quota_us, which specifies the total amount of time in microseconds for which all tasks in a cgroup can run during a one second period. We can potentially model the request handler threads in a similar way. For example, each request handler thread can be 1 request handler unit and the admin can configure a limit on how many units (say 0.01) a client can have.
> > >> > >> > > > > > > > > > > >
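
The cgroup analogy reduces to a simple accounting rule: a client with N units may use at most N milliseconds of handler time per millisecond of wall clock, measured over some window. A purely illustrative sketch:

    // cfs_quota_us-style accounting for request handler units: a client with
    // `units` may use units * windowMs of handler time per window; overage is
    // converted into a response delay at the allowed rate.
    public class HandlerUnitQuota {
        private final double units;   // e.g. 0.01 = 1% of one handler thread
        private final long windowMs;  // accounting window
        private long windowStartMs = System.currentTimeMillis();
        private double usedMs;        // handler time consumed in this window

        public HandlerUnitQuota(double units, long windowMs) {
            this.units = units;
            this.windowMs = windowMs;
        }

        // Record handler time spent on a request; returns the delay to apply.
        public synchronized long recordMs(double handlerTimeMs) {
            long now = System.currentTimeMillis();
            if (now - windowStartMs >= windowMs) { // start a new window
                windowStartMs = now;
                usedMs = 0;
            }
            usedMs += handlerTimeMs;
            double allowedMs = units * windowMs;   // cf. cpu.cfs_quota_us
            return usedMs <= allowedMs ? 0 : (long) ((usedMs - allowedMs) / units);
        }
    }
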
> > >> > >> > > > > > > > > > > > Regarding not throttling the internal broker to broker requests: we could do that. Alternatively, we could just let the admin configure a high limit for the kafka user (it may not be able to do that easily based on clientId though).
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Ideally we want to be able to protect the utilization of the network thread pool too. The difficulty is mostly what Rajini said: (1) The mechanism for throttling the requests is through Purgatory and we will have to think through how to integrate that into the network layer. (2) In the network layer, currently we know the user, but not the clientId of the request. So, it's a bit tricky to throttle based on clientId there. Plus, the byteOut quota can already protect the network thread utilization for fetch requests. So, if we can't figure out this part right now, just focusing on the request handling threads for this KIP is still a useful feature.
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Thanks,
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > Jun
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > >> > >> > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > [Message clipped]
> > either
> > >> is
> > >> > >> > > exceeded,
> > >> > >> > > > > the
> > >> > >> > > > > > > > > > response
> > >> > >> > > > > > > > > > > is
> > >> > >> > > > > > > > > > > > > throttled - it would mean maintaining two
> > >> sets
> > >> > of
> > >> > >> > > metrics
> > >> > >> > > > > for
> > >> > >> > > > > > > the
> > >> > >> > > > > > > > > two
> > >> > >> > > > > > > > > > > > > durations, but would result in more
> > >> meaningful
> > >> > >> > ratios.
> > >> > >> > > We
> > >> > >> > > > > > could
> > >> > >> > > > > > > > > > define
> > >> > >> > > > > > > > > > > > two
> > >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request
> > threads
> > >> > and
> > >> > >> 10%
> > >> > >> > > of
> > >> > >> > > > > > > network
> > >> > >> > > > > > > > > > > > threads),
> > >> > >> > > > > > > > > > > > > but that seems unnecessary and harder to
> > >> explain
> > >> > >> to
> > >> > >> > > > users.
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > Back to why and how quotas are applied to
> > >> > network
> > >> > >> > > thread
> > >> > >> > > > > > > > > utilization:
> > >> > >> > > > > > > > > > > > > a) In the case of fetch,  the time spent
> in
> > >> the
> > >> > >> > network
> > >> > >> > > > > > thread
> > >> > >> > > > > > > > may
> > >> > >> > > > > > > > > be
> > >> > >> > > > > > > > > > > > > significant and I can see the need to
> > include
> > >> > >> this.
> > >> > >> > Are
> > >> > >> > > > > there
> > >> > >> > > > > > > > other
> > >> > >> > > > > > > > > > > > > requests where the network thread
> > >> utilization is
> > >> > >> > > > > significant?
> > >> > >> > > > > > > In
> > >> > >> > > > > > > > > the
> > >> > >> > > > > > > > > > > case
> > >> > >> > > > > > > > > > > > > of fetch, request handler thread
> > utilization
> > >> > would
> > >> > >> > > > throttle
> > >> > >> > > > > > > > clients
> > >> > >> > > > > > > > > > > with
> > >> > >> > > > > > > > > > > > > high request rate, low data volume and
> > fetch
> > >> > byte
> > >> > >> > rate
> > >> > >> > > > > quota
> > >> > >> > > > > > > will
> > >> > >> > > > > > > > > > > > throttle
> > >> > >> > > > > > > > > > > > > clients with high data volume. Network
> > thread
> > >> > >> > > utilization
> > >> > >> > > > > is
> > >> > >> > > > > > > > > perhaps
> > >> > >> > > > > > > > > > > > > proportional to the data volume. I am
> > >> wondering
> > >> > >> if we
> > >> > >> > > > even
> > >> > >> > > > > > need
> > >> > >> > > > > > > > to
> > >> > >> > > > > > > > > > > > throttle
> > >> > >> > > > > > > > > > > > > based on network thread utilization or
> > >> whether
> > >> > the
> > >> > >> > data
> > >> > >> > > > > > volume
> > >> > >> > > > > > > > > quota
> > >> > >> > > > > > > > > > > > covers
> > >> > >> > > > > > > > > > > > > this case.
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > b) At the moment, we record and check for
> > >> quota
> > >> > >> > > violation
> > >> > >> > > > > at
> > >> > >> > > > > > > the
> > >> > >> > > > > > > > > same
> > >> > >> > > > > > > > > > > > time.
> > >> > >> > > > > > > > > > > > > If a quota is violated, the response is
> > >> delayed.
> > >> > >> > Using
> > >> > >> > > > > Jay'e
> > >> > >> > > > > > > > > example
> > >> > >> > > > > > > > > > of
> > >> > >> > > > > > > > > > > > > disk reads for fetches happening in the
> > >> network
> > >> > >> > thread,
> > >> > >> > > > We
> > >> > >> > > > > > > can't
> > >> > >> > > > > > > > > > record
> > >> > >> > > > > > > > > > > > and
> > >> > >> > > > > > > > > > > > > delay a response after the disk reads. We
> > >> could
> > >> > >> > record
> > >> > >> > > > the
> > >> > >> > > > > > time
> > >> > >> > > > > > > > > spent
> > >> > >> > > > > > > > > > > on
> > >> > >> > > > > > > > > > > > > the network thread when the response is
> > >> complete
> > >> > >> and
> > >> > >> > > > > > introduce
> > >> > >> > > > > > > a
> > >> > >> > > > > > > > > > delay
> > >> > >> > > > > > > > > > > > for
> > >> > >> > > > > > > > > > > > > handling a subsequent request (separate
> out
> > >> > >> recording
> > >> > >> > > and
> > >> > >> > > > > > quota
> > >> > >> > > > > > > > > > > violation
> > >> > >> > > > > > > > > > > > > handling in the case of network thread
> > >> > overload).
> > >> > >> > Does
> > >> > >> > > > that
> > >> > >> > > > > > > make
> > >> > >> > > > > > > > > > sense?
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > Regards,
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > Rajini
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket
> > Qin <
> > >> > >> > > > > > > > becket.qin@gmail.com>
> > >> > >> > > > > > > > > > > > wrote:
> > >> > >> > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Hey Jay,
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU
> time
> > >> is a
> > >> > >> > little
> > >> > >> > > > > > > tricky. I
> > >> > >> > > > > > > > > am
> > >> > >> > > > > > > > > > > > > thinking
> > >> > >> > > > > > > > > > > > > > that maybe we can use the existing
> > request
> > >> > >> > > statistics.
> > >> > >> > > > > They
> > >> > >> > > > > > > are
> > >> > >> > > > > > > > > > > already
> > >> > >> > > > > > > > > > > > > > very detailed so we can probably see
> the
> > >> > >> > approximate
> > >> > >> > > > CPU
> > >> > >> > > > > > time
> > >> > >> > > > > > > > > from
> > >> > >> > > > > > > > > > > it,
> > >> > >> > > > > > > > > > > > > e.g.
> > >> > >> > > > > > > > > > > > > > something like (total_time -
> > >> > >> > > > request/response_queue_time
> > >> > >> > > > > -
> > >> > >> > > > > > > > > > > > remote_time).
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user
> is
> > >> > >> throttled
> > >> > >> > > it
> > >> > >> > > > is
> > >> > >> > > > > > > > likely
> > >> > >> > > > > > > > > > that
> > >> > >> > > > > > > > > > > > we
> > >> > >> > > > > > > > > > > > > > need to see if anything has went wrong
> > >> first,
> > >> > >> and
> > >> > >> > if
> > >> > >> > > > the
> > >> > >> > > > > > > users
> > >> > >> > > > > > > > > are
> > >> > >> > > > > > > > > > > well
> > >> > >> > > > > > > > > > > > > > behaving and just need more resources,
> we
> > >> will
> > >> > >> have
> > >> > >> > > to
> > >> > >> > > > > bump
> > >> > >> > > > > > > up
> > >> > >> > > > > > > > > the
> > >> > >> > > > > > > > > > > > quota
> > >> > >> > > > > > > > > > > > > > for them. It is true that
> pre-allocating
> > >> CPU
> > >> > >> time
> > >> > >> > > quota
> > >> > >> > > > > > > > precisely
> > >> > >> > > > > > > > > > for
> > >> > >> > > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > > users is difficult. So in practice it
> > would
> > >> > >> > probably
> > >> > >> > > be
> > >> > >> > > > > > more
> > >> > >> > > > > > > > like
> > >> > >> > > > > > > > > > > first
> > >> > >> > > > > > > > > > > > > set
> > >> > >> > > > > > > > > > > > > > a relative high protective CPU time
> quota
> > >> for
> > >> > >> > > everyone
> > >> > >> > > > > and
> > >> > >> > > > > > > > > increase
> > >> > >> > > > > > > > > > > > that
> > >> > >> > > > > > > > > > > > > > for some individual clients on demand.
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Thanks,
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM,
> Guozhang
> > >> > Wang <
> > >> > >> > > > > > > > > wangguoz@gmail.com
> > >> > >> > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > wrote:
> > >> > >> > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see
> > it
> > >> > >> > happening.
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling,
> or
> > >> more
> > >> > >> > > > > specifically
> > >> > >> > > > > > > > > > > processing
> > >> > >> > > > > > > > > > > > > time
> > >> > >> > > > > > > > > > > > > > > ratio instead of the request rate
> > >> throttling
> > >> > >> as
> > >> > >> > > well.
> > >> > >> > > > > > > Becket
> > >> > >> > > > > > > > > has
> > >> > >> > > > > > > > > > > very
> > >> > >> > > > > > > > > > > > > > well
> > >> > >> > > > > > > > > > > > > > > summed my rationales above, and one
> > >> thing to
> > >> > >> add
> > >> > >> > > here
> > >> > >> > > > > is
> > >> > >> > > > > > > that
> > >> > >> > > > > > > > > the
> > >> > >> > > > > > > > > > > > > former
> > >> > >> > > > > > > > > > > > > > > has a good support for both
> "protecting
> > >> > >> against
> > >> > >> > > rogue
> > >> > >> > > > > > > > clients"
> > >> > >> > > > > > > > > as
> > >> > >> > > > > > > > > > > > well
> > >> > >> > > > > > > > > > > > > as
> > >> > >> > > > > > > > > > > > > > > "utilizing a cluster for
> multi-tenancy
> > >> > usage":
> > >> > >> > when
> > >> > >> > > > > > > thinking
> > >> > >> > > > > > > > > > about
> > >> > >> > > > > > > > > > > > how
> > >> > >> > > > > > > > > > > > > to
> > >> > >> > > > > > > > > > > > > > > explain this to the end users, I find
> > it
> > >> > >> actually
> > >> > >> > > > more
> > >> > >> > > > > > > > natural
> > >> > >> > > > > > > > > > than
> > >> > >> > > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > > > request rate since as mentioned
> above,
> > >> > >> different
> > >> > >> > > > > requests
> > >> > >> > > > > > > > will
> > >> > >> > > > > > > > > > have
> > >> > >> > > > > > > > > > > > > quite
> > >> > >> > > > > > > > > > > > > > > different "cost", and Kafka today
> > already
> > >> > have
> > >> > >> > > > various
> > >> > >> > > > > > > > request
> > >> > >> > > > > > > > > > > types
> > >> > >> > > > > > > > > > > > > > > (produce, fetch, admin, metadata,
> etc),
> > >> > >> because
> > >> > >> > of
> > >> > >> > > > that
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > > request
> > >> > >> > > > > > > > > > > > > rate
> > >> > >> > > > > > > > > > > > > > > throttling may not be as effective
> > >> unless it
> > >> > >> is
> > >> > >> > set
> > >> > >> > > > > very
> > >> > >> > > > > > > > > > > > > conservatively.
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > Regarding to user reactions when they
> > are
> > >> > >> > > throttled,
> > >> > >> > > > I
> > >> > >> > > > > > > think
> > >> > >> > > > > > > > it
> > >> > >> > > > > > > > > > may
> > >> > >> > > > > > > > > > > > > > differ
> > >> > >> > > > > > > > > > > > > > > case-by-case, and need to be
> > discovered /
> > >> > >> guided
> > >> > >> > by
> > >> > >> > > > > > looking
> > >> > >> > > > > > > > at
> > >> > >> > > > > > > > > > > > relative
> > >> > >> > > > > > > > > > > > > > > metrics. So in other words users
> would
> > >> not
> > >> > >> expect
> > >> > >> > > to
> > >> > >> > > > > get
> > >> > >> > > > > > > > > > additional
> > >> > >> > > > > > > > > > > > > > > information by simply being told
> "hey,
> > >> you
> > >> > are
> > >> > >> > > > > > throttled",
> > >> > >> > > > > > > > > which
> > >> > >> > > > > > > > > > is
> > >> > >> > > > > > > > > > > > all
> > >> > >> > > > > > > > > > > > > > > what throttling does; they need to
> > take a
> > >> > >> > follow-up
> > >> > >> > > > > step
> > >> > >> > > > > > > and
> > >> > >> > > > > > > > > see
> > >> > >> > > > > > > > > > > > "hmm,
> > >> > >> > > > > > > > > > > > > > I'm
> > >> > >> > > > > > > > > > > > > > > throttled probably because of ..",
> > which
> > >> is
> > >> > by
> > >> > >> > > > looking
> > >> > >> > > > > at
> > >> > >> > > > > > > > other
> > >> > >> > > > > > > > > > > > metric
> > >> > >> > > > > > > > > > > > > > > values: e.g. whether I'm bombarding
> the
> > >> > >> brokers
> > >> > >> > > with
> > >> > >> > > > > > > metadata
> > >> > >> > > > > > > > > > > > request,
> > >> > >> > > > > > > > > > > > > > > which are usually cheap to handle but
> > I'm
> > >> > >> sending
> > >> > >> > > > > > thousands
> > >> > >> > > > > > > > per
> > >> > >> > > > > > > > > > > > second;
> > >> > >> > > > > > > > > > > > > > or
> > >> > >> > > > > > > > > > > > > > > is it because I'm catching up and
> hence
> > >> > >> sending
> > >> > >> > > very
> > >> > >> > > > > > heavy
> > >> > >> > > > > > > > > > fetching
> > >> > >> > > > > > > > > > > > > > request
> > >> > >> > > > > > > > > > > > > > > with large min.bytes, etc.
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > Regarding to the implementation, as
> > once
> > >> > >> > discussed
> > >> > >> > > > with
> > >> > >> > > > > > > Jun,
> > >> > >> > > > > > > > > this
> > >> > >> > > > > > > > > > > > seems
> > >> > >> > > > > > > > > > > > > > not
> > >> > >> > > > > > > > > > > > > > > very difficult since today we are
> > already
> > >> > >> > > collecting
> > >> > >> > > > > the
> > >> > >> > > > > > > > > "thread
> > >> > >> > > > > > > > > > > pool
> > >> > >> > > > > > > > > > > > > > > utilization" metrics, which is a
> single
> > >> > >> > percentage
> > >> > >> > > > > > > > > > > > "aggregateIdleMeter"
> > >> > >> > > > > > > > > > > > > > > value; but we are already effectively
> > >> > >> aggregating
> > >> > >> > > it
> > >> > >> > > > > for
> > >> > >> > > > > > > each
> > >> > >> > > > > > > > > > > > requests
> > >> > >> > > > > > > > > > > > > in
> > >> > >> > > > > > > > > > > > > > > KafkaRequestHandler, and we can just
> > >> extend
> > >> > >> it by
> > >> > >> > > > > > recording
> > >> > >> > > > > > > > the
> > >> > >> > > > > > > > > > > > source
> > >> > >> > > > > > > > > > > > > > > client id when handling them and
> > >> aggregating
> > >> > >> by
> > >> > >> > > > > clientId
> > >> > >> > > > > > as
> > >> > >> > > > > > > > > well
> > >> > >> > > > > > > > > > as
> > >> > >> > > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > > > total aggregate.
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > Guozhang
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay
> > >> Kreps <
> > >> > >> > > > > > > jay@confluent.io
> > >> > >> > > > > > > > >
> > >> > >> > > > > > > > > > > wrote:
> > >> > >> > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
> > >> > >> > > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > > When I thought about it more
> deeply I
> > >> came
> > >> > >> > around
> > >> > >> > > > to
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > > "percent
> > >> > >> > > > > > > > > > > > of
> > >> > >> > > > > > > > > > > > > > > > processing time" metric too. It
> > seems a
> > >> > lot
> > >> > >> > > closer
> > >> > >> > > > to
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > thing
> > >> > >> > > > > > > > > > > we
> > >> > >> > > > > > > > > > > > > > > actually
> > >> > >> > > > > > > > > > > > > > > > care about and need to protect. I
> > also
> > >> > think
> > >> > >> > this
> > >> > >> > > > > would
> > >> > >> > > > > > > be
> > >> > >> > > > > > > > a
> > >> > >> > > > > > > > > > very
> > >> > >> > > > > > > > > > > > > > useful
> > >> > >> > > > > > > > > > > > > > > > metric even in the absence of
> > >> throttling
> > >> > >> just
> > >> > >> > to
> > >> > >> > > > > debug
> > >> > >> > > > > > > > whose
> > >> > >> > > > > > > > > > > using
> > >> > >> > > > > > > > > > > > > > > > capacity.
> > >> > >> > > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > > Two problems to consider:
> > >> > >> > > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > >    1. I agree that for the user it
> is
> > >> > >> > > > understandable
> > >> > >> > > > > > what
> > >> > >> > > > > > > > > lead
> > >> > >> > > > > > > > > > to
> > >> > >> > > > > > > > > > > > > their
> > >> > >> > > > > > > > > > > > > > > >    being throttled, but it is a bit
> > >> hard
> > >> > to
> > >> > >> > > figure
> > >> > >> > > > > out
> > >> > >> > > > > > > the
> > >> > >> > > > > > > > > safe
> > >> > >> > > > > > > > > > > > range
> > >> > >> > > > > > > > > > > > > > for
> > >> > >> > > > > > > > > > > > > > > >    them. i.e. if I have a new app
> > that
> > >> > will
> > >> > >> > send
> > >> > >> > > > 200
> > >> > >> > > > > > > > > > > messages/sec I
> > >> > >> > > > > > > > > > > > > can
> > >> > >> > > > > > > > > > > > > > > >    probably reason that I'll be
> under
> > >> the
> > >> > >> > > > throttling
> > >> > >> > > > > > > limit
> > >> > >> > > > > > > > of
> > >> > >> > > > > > > > > > 300
> > >> > >> > > > > > > > > > > > > > > req/sec.
> > >> > >> > > > > > > > > > > > > > > >    However if I need to be under a
> > 10%
> > >> CPU
> > >> > >> > > > resources
> > >> > >> > > > > > > limit
> > >> > >> > > > > > > > it
> > >> > >> > > > > > > > > > may
> > >> > >> > > > > > > > > > > > be
> > >> > >> > > > > > > > > > > > > a
> > >> > >> > > > > > > > > > > > > > > bit
> > >> > >> > > > > > > > > > > > > > > >    harder for me to know a priori
> if
> > i
> > >> > will
> > >> > >> or
> > >> > >> > > > won't.
> > >> > >> > > > > > > > > > > > > > > >    2. Calculating the available CPU
> > >> time
> > >> > is
> > >> > >> a
> > >> > >> > bit
> > >> > >> > > > > > > difficult
> > >> > >> > > > > > > > > > since
> > >> > >> > > > > > > > > > > > > there
> > >> > >> > > > > > > > > > > > > > > are
> > >> > >> > > > > > > > > > > > > > > >    actually two thread pools--the
> I/O
> > >> > >> threads
> > >> > >> > and
> > >> > >> > > > the
> > >> > >> > > > > > > > network
> > >> > >> > > > > > > > > > > > > threads.
> > >> > >> > > > > > > > > > > > > > I
> > >> > >> > > > > > > > > > > > > > > > think
> > >> > >> > > > > > > > > > > > > > > >    it might be workable to count
> just
> > >> the
> > >> > >> I/O
> > >> > >> > > > thread
> > >> > >> > > > > > time
> > >> > >> > > > > > > > as
> > >> > >> > > > > > > > > in
> > >> > >> > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > > > > proposal,
> > >> > >> > > > > > > > > > > > > > > >    but the network thread work is
> > >> actually
> > >> > >> > > > > non-trivial
> > >> > >> > > > > > > > (e.g.
> > >> > >> > > > > > > > > > all
> > >> > >> > > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > > disk
> > >> > >> > > > > > > > > > > > > > > >    reads for fetches happen in that
> > >> > >> thread). If
> > >> > >> > > you
> > >> > >> > > > > > count
> > >> > >> > > > > > > > > both
> > >> > >> > > > > > > > > > > the
> > >> > >> > > > > > > > > > > > > > > network
> > >> > >> > > > > > > > > > > > > > > > and
> > >> > >> > > > > > > > > > > > > > > >    I/O threads it can skew things a
> > >> bit.
> > >> > >> E.g.
> > >> > >> > say
> > >> > >> > > > you
> > >> > >> > > > > > > have
> > >> > >> > > > > > > > 50
> > >> > >> > > > > > > > > > > > network
> > >> > >> > > > > > > > > > > > > > > > threads,
> > >> > >> > > > > > > > > > > > > > > >    10 I/O threads, and 8 cores,
> what
> > is
> > >> > the
> > >> > >> > > > available
> > >> > >> > > > > > cpu
> > >> > >> > > > > > > > > time
> > >> > >> > > > > > > > > > > > > > available
> > >> > >> > > > > > > > > > > > > > > > in a
> > >> > >> > > > > > > > > > > > > > > >    second? I suppose this is a
> > problem
> > >> > >> whenever
> > >> > >> > > you
> > >> > >> > > > > > have
> > >> > >> > > > > > > a
> > >> > >> > > > > > > > > > > > bottleneck
> > >> > >> > > > > > > > > > > > > > > > between
> > >> > >> > > > > > > > > > > > > > > >    I/O and network threads or if
> you
> > >> end
> > >> > up
> > >> > >> > > > > > significantly
> > >> > >> > > > > > > > > > > > > > > over-provisioning
> > >> > >> > > > > > > > > > > > > > > >    one pool (both of which are hard
> > to
> > >> > >> avoid).
> > >> > >> > > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > > An alternative for CPU throttling
> > >> would be
> > >> > >> to
> > >> > >> > use
> > >> > >> > > > > this
> > >> > >> > > > > > > api:
> > >> > >> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/
> > >> > >> > > > > > 1.5.0/docs/api/java/lang/
> > >> > >> > > > > > > > > > > > > > > > management/ThreadMXBean.html#
> > >> > >> > > > getThreadCpuTime(long)
> > >> > >> > > > > > > > > > > > > > > >
> > >> > >> > > > > > > > > > > > > > > > That would let you track actual CPU
> > >> usage
> > >> > >> > across
> > >> > >> > > > the
> > >> > >> > > > > > > > network,
> > >> > >> > > > > > > > > > I/O
> > >> > >> > > > > > > > > > > > > > > threads,
> > >> > >> > > > > > > > > > > > > > > > and purgatory threads and look at
> it
> > >> as a
> > >> > >> > > > percentage
> > >> > >> > > > > of
> > >> > >> > > > > > > > total
> > >> > >> > > > > > > > > > > > cores.
> > >> > >> > > > > > > > > > > > > I
> > >> > >> > > > > > > > > > > > > > > > think this fixes many problems in
> the
> > >> > >> > reliability
> > >> > >> > > > of
> > >> > >> > > > > > the
> > >> > >> > > > > > > > > > metric.
> > >> > >> > > > > > > > > > > > It's
> > >> > >> > > > > > > > > > > > > > > > meaning is slightly different as it
> > is
> > >> > just
> > >> > >> CPU
> > >> > >> > > > (you
> > >> > >> > > > > > > don't
> > >> > >> > > > > > > > > get
> > >> > >> > > > > > > > > > > > > charged
> > >> > >> > > > > > > > > > > > > > > for
> > >> > >> > > > > > > > > > > > > > > > time blocking on I/O) but that may
> be
> > >> okay
> > >> > >> > > because
> > >> > >> > > > we
> > >> > >> > > > > > > > already
> > >> > >> > > > > > > > > > > have
> > >> > >> > > > > > > > > > > > a
> > >> > >> > > > > > > > > > > > > > > > throttle on I/O. The downside is I
> > >> think
> > >> > it
> > >> > >> is
> > >> > >> > > > > possible
> > >> > >> > > > > > > > this
> > >> > >> > > > > > > > > > api
> > >> > >> > > > > > > > > > > > can
> > >> > >> > > > > > > > > > > > > be
> > >> > >> > > > > > > > > > > > > > > > disabled or isn't always available
> > and
> > >> it
> > >> > >> may
> > >> > >> > > also
> > >> > >> > > > be
> > >> > >> > > > > > > > > expensive
> > >> > >> > > > > > > > > > > > (also
> > >> > >> > > > > > > > > > > > > > > I've
> > >> > >> > > > > > > > > > > > > > > > never used it so not sure if it
> > really
> > >> > works
> > >> > >> > the
> > >> > >> > > > way
> > >> > >> > > > > i
> > >> > >> > > > > > > > > think).
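>
> A minimal (untested) sketch of sampling that API, assuming the broker
> keeps track of the thread ids of its network/io/purgatory pools:
>
>     import java.lang.management.ManagementFactory;
>     import java.lang.management.ThreadMXBean;
>
>     // Returns cumulative CPU nanos for one thread, or -1 if the JVM
>     // does not support/enable per-thread CPU timing or the thread died.
>     static long sampleCpuNanos(long threadId) {
>         ThreadMXBean bean = ManagementFactory.getThreadMXBean();
>         if (bean.isThreadCpuTimeSupported() && bean.isThreadCpuTimeEnabled()) {
>             return bean.getThreadCpuTime(threadId);
>         }
>         return -1;
>     }
>
>     // Periodic deltas of this value, summed over a pool's threads and
>     // divided by (numCores * intervalNanos), give a utilization ratio.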
>
> -Jay
>
> On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com> wrote:
>
> If the purpose of the KIP is only to protect the cluster from being
> overwhelmed by crazy clients and is not intended to address the resource
> allocation problem among the clients, I am wondering if using a request
> handling time quota (CPU time quota) is a better option. Here are the
> reasons:
>
> 1. A request handling time quota has better protection. Say we have a
> request rate quota and set that to some value like 100 requests/sec; it
> is possible that some of the requests are very expensive and actually
> take a lot of time to handle. In that case a few clients may still
> occupy a lot of CPU time even though the request rate is low. Arguably
> we can carefully set a request rate quota for each request and client id
> combination, but it could still be tricky to get it right for everyone.
>
> If we use the request handling time quota, we can simply say that no
> client can take up more than 30% of the total request handling capacity
> (measured by time), regardless of the difference among different
> requests or what the client is doing. In this case maybe we can quota
> all the requests if we want to.
>
> 2. The main benefit of using a request rate limit is that it seems more
> intuitive. It is true that it is probably easier to explain to the user
> what it means. However, in practice the impact of a request rate quota
> is not more quantifiable than that of a request handling time quota.
> Unlike the byte rate quota, it is still difficult to give a number for
> the impact on throughput or latency when a request rate quota is hit. So
> it is not better than the request handling time quota. In fact I feel it
> is clearer to tell a user "you are limited because you have taken 30% of
> the CPU time on the broker" than something like "your request rate quota
> on metadata requests has been reached".
>
> Thanks,
>
> Jiangjie (Becket) Qin
>
> On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io> wrote:
>
> I think this proposal makes a lot of sense (especially now that it is
> oriented around request rate) and fills the biggest remaining gap in the
> multi-tenancy story.
>
> I think for intra-cluster communication (StopReplica, etc) we could
> avoid throttling entirely. You can secure or otherwise lock down the
> cluster communication to avoid any unauthorized external party from
> trying to initiate these requests. As a result we are as likely to cause
> problems as solve them by throttling these, right?
>
> I'm not so sure that we should exempt the consumer requests such as
> heartbeat. It's true that if we throttle an app's heartbeat requests it
> may cause it to fall out of its consumer group. However if we don't
> throttle it, it may DDOS the cluster if the heartbeat interval is set
> incorrectly or if some client in some language has a bug. I think the
> policy with this kind of throttling is to protect the cluster above any
> individual app, right? I think in general this should be okay since for
> most deployments this setting is meant as more of a safety valve---that
> is, rather than set something very close to what you expect to need (say
> 2 req/sec or whatever) you would have something quite high (like 100
> req/sec) with this meant to prevent a client gone crazy. I think when
> used this way allowing those to be throttled would actually provide
> meaningful protection.
>
> -Jay

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jay Kreps <ja...@confluent.io>.
A couple of quick points:

1. Even though the implementation of this quota is only using io thread
time, I think we should call it something like "request-time". This will
give us flexibility to improve the implementation to cover network threads
in the future and will avoid exposing internal details like our thread
pools on the server.

2. Jun/Roger, I get what you are trying to fix but the idea of thread/units
is super unintuitive as a user-facing knob. I had to read the KIP like
eight times to understand this. I'm not sure your point (that increasing
the number of threads is a problem with a percentage-based value) holds;
it really depends on whether the user thinks about the "percentage
of request processing time" or "thread units". If they think "I have
allocated 10% of my request processing time to user x" then it is a bug
that increasing the thread count decreases that percent as it does in the
current proposal. As a practical matter I think the only way to actually
reason about this is as a percent---I just don't believe people are going
to think, "ah, 4.3 thread units, that is the right amount!". Instead I
think they have to understand this thread unit concept, figure out what
they have set in number of threads, compute a percent and then come up with
the number of thread units, and these will all be wrong if that thread
count changes. I also think this ties us to throttling the I/O thread pool,
which may not be where we want to end up.
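
To make that concrete, the bookkeeping a user ends up doing looks roughly
like this (numbers invented; only num.io.threads is a real broker config):

    int ioThreads = 8;          // broker's num.io.threads
    double threadUnits = 0.8;   // quota granted to user x
    double percent = 100.0 * threadUnits / ioThreads;  // 10% today
    // If the admin later raises num.io.threads to 16, the same 0.8
    // units silently becomes 5% of request processing time.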

3. For what it's worth I do think having a single throttle_ms field in all
the responses that combines all throttling from all quotas is probably the
simplest. There could be a use case for having separate fields for each,
but I think that is actually harder to use/monitor in the common case so
unless someone has a use case I think just one should be fine.

-Jay

On Fri, Feb 24, 2017 at 4:21 AM, Rajini Sivaram <ra...@gmail.com>
wrote:

> I have updated the KIP based on the discussions so far.
>
>
> Regards,
>
> Rajini
>
> On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Thank you all for the feedback.
> >
> > Ismael #1. It makes sense not to throttle inter-broker requests like
> > LeaderAndIsr etc. The simplest way to ensure that clients cannot use
> > these requests to bypass quotas for DoS attacks is to ensure that ACLs
> > prevent clients from using these requests and that unauthorized
> > requests are included towards quotas.
> >
> > Ismael #2, Jay #1: I was thinking that these quotas can return a
> > separate throttle time, and all utilization based quotas could use the
> > same field (we won't add another one for network thread utilization,
> > for instance). But perhaps it makes sense to keep byte rate quotas
> > separate in produce/fetch responses to provide separate metrics? Agree
> > with Ismael that the name of the existing field should be changed if we
> > have two. Happy to switch to a single combined throttle time if that is
> > sufficient.
> >
> > Ismael #4, #5, #6: Will update the KIP. Will use a dot separated name
> > for the new property. Replication quotas use dot separated names, so it
> > will be consistent with all properties except the byte rate quotas.
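> >
> > For example, setting the new quota for a user might then look something
> > like this (the property name is purely illustrative, not final):
> >
> >     bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
> >       --add-config 'request.percentage=20' \
> >       --entity-type users --entity-name user1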
> >
> > Radai: #1 Request processing time rather than request rate was chosen
> > because the time per request can vary significantly between requests,
> > as mentioned in the discussion and the KIP.
> > #2 Two separate quotas for heartbeats/regular requests feel like more
> > configuration and more metrics. Since most users would set quotas
> > higher than the expected usage and quotas are more of a safety net, a
> > single quota should work in most cases.
> > #3 The number of requests in purgatory is limited by the number of
> > active connections, since only one request per connection will be
> > throttled at a time.
> > #4 As with byte rate quotas, to use the full allocated quotas,
> > clients/users would need to use partitions that are distributed across
> > the cluster. The alternative of using cluster-wide quotas instead of
> > per-broker quotas would be far too complex to implement.
> >
> > Dong: We currently have two ClientQuotaManagers for the quota types
> > Fetch and Produce. A new one will be added for IOThread, which manages
> > quotas for I/O thread utilization. This will not update the Fetch or
> > Produce queue-size, but will have a separate metric for the queue-size.
> > I wasn't planning to add any additional metrics apart from the
> > equivalent ones for existing quotas as part of this KIP. The ratio of
> > byte-rate to I/O thread utilization could be slightly misleading since
> > it depends on the sequence of requests. But we can look into more
> > metrics after the KIP is implemented if required.
> >
> > I think we need to limit the maximum delay since all requests are
> > throttled. If a client has a quota of 0.001 units and a single request
> > used 50ms, we don't want to delay all requests from the client by 50
> > seconds, throwing the client out of all its consumer groups. The issue
> > arises only if a user is allocated a quota that is insufficient to
> > process one large request. The expectation is that the units allocated
> > per user will be much higher than the time taken to process one
> > request, so the limit should seldom be applied. Agree this needs proper
> > documentation.
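> >
> > To spell out that arithmetic (a sketch of the idea only, not the exact
> > formula in the implementation; the window value is invented):
> >
> >     double quota = 0.001;   // fraction of I/O thread time allowed
> >     long timeUsedMs = 50;   // one expensive request
> >     // naive delay needed to bring usage back within quota:
> >     long naiveDelayMs = (long) (timeUsedMs / quota) - timeUsedMs; // 49,950 ms
> >     long windowMs = 30_000; // quota metrics window
> >     long delayMs = Math.min(naiveDelayMs, windowMs); // capped at the window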
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com> wrote:
> >
> >> @jun: I wasn't concerned about tying up a request processing thread,
> >> but IIUC the code does still read the entire request out, which might
> >> add up to a non-negligible amount of memory.
> >>
> >> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com> wrote:
> >>
> >> > Hey Rajini,
> >> >
> >> > The current KIP says that the maximum delay will be reduced to the
> >> > window size if it is larger than the window size. I have a concern
> >> > with this:
> >> >
> >> > 1) This essentially means that the user is allowed to exceed their
> >> > quota over a long period of time. Can you provide an upper bound on
> >> > this deviation?
> >> >
> >> > 2) What is the motivation for capping the maximum delay at the window
> >> > size? I am wondering if there is a better alternative to address the
> >> > problem.
> >> >
> >> > 3) It means that the existing metric-related config will have a more
> >> > direct impact on the mechanism of this io-thread-unit-based quota.
> >> > This may be an important change depending on the answer to 1) above.
> >> > We probably need to document this more explicitly.
> >> >
> >> > Dong
> >> >
> >> >
> >> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com> wrote:
> >> >
> >> > > Hey Jun,
> >> > >
> >> > > Yeah you are right. I thought it wasn't because at LinkedIn it would
> >> > > be too much pressure on inGraph to expose those per-clientId
> >> > > metrics, so we ended up printing them periodically to a local log.
> >> > > Never mind if it is not a general problem.
> >> > >
> >> > > Hey Rajini,
> >> > >
> >> > > - I agree with Jay that we probably don't want to add a new field
> >> > > for every quota in ProduceResponse or FetchResponse. Is there any
> >> > > use-case for having separate throttle-time fields for the
> >> > > byte-rate-quota and the io-thread-unit-quota? You probably need to
> >> > > document this as an interface change if you plan to add a new field
> >> > > in any request.
> >> > >
> >> > > - I don't think IOThread belongs in quotaType. The existing quota
> >> > > types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
> >> > > identify the type of request that is throttled, not the quota
> >> > > mechanism that is applied.
> >> > >
> >> > > - If a request is throttled due to this io-thread-unit-based quota,
> >> > > is the existing queue-size metric in ClientQuotaManager incremented?
> >> > >
> >> > > - In the interest of providing a guideline for admins to decide the
> >> > > io-thread-unit-based quota, and for users to understand its impact
> >> > > on their traffic, would it be useful to have a metric that shows the
> >> > > overall byte-rate per io-thread-unit? Can we also show this as a
> >> > > per-clientId metric?
> >> > >
> >> > > Thanks,
> >> > > Dong
> >> > >
> >> > >
> >> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:
> >> > >
> >> > >> Hi, Ismael,
> >> > >>
> >> > >> For #3, typically, an admin won't configure more io threads than
> >> > >> CPU cores, but it's possible for an admin to start with fewer io
> >> > >> threads than cores and grow that later on.
> >> > >>
> >> > >> Hi, Dong,
> >> > >>
> >> > >> I think the throttleTime sensor on the broker tells the admin
> >> > >> whether a user/clientId is throttled or not.
> >> > >>
> >> > >> Hi, Radai,
> >> > >>
> >> > >> The reasoning for delaying the throttled requests on the broker
> >> > >> instead of returning an error immediately is that the latter has no
> >> > >> way to prevent the client from retrying immediately, which will
> >> > >> make things worse. The delaying logic is based on a delay queue. A
> >> > >> separate expiration thread just waits on the next request to
> >> > >> expire. So, it doesn't tie up a request handler thread.
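> >> > >>
> >> > >> In code the mechanism is roughly the following (a simplified
> >> > >> sketch, not the actual broker implementation; class names are
> >> > >> invented):
> >> > >>
> >> > >>     import java.util.concurrent.DelayQueue;
> >> > >>     import java.util.concurrent.Delayed;
> >> > >>     import java.util.concurrent.TimeUnit;
> >> > >>
> >> > >>     // A throttled response only remembers when it may be completed.
> >> > >>     class ThrottledResponse implements Delayed {
> >> > >>         final long dueMs;
> >> > >>         final Runnable send;
> >> > >>
> >> > >>         ThrottledResponse(long delayMs, Runnable send) {
> >> > >>             this.dueMs = System.currentTimeMillis() + delayMs;
> >> > >>             this.send = send;
> >> > >>         }
> >> > >>
> >> > >>         public long getDelay(TimeUnit unit) {
> >> > >>             return unit.convert(dueMs - System.currentTimeMillis(),
> >> > >>                                 TimeUnit.MILLISECONDS);
> >> > >>         }
> >> > >>
> >> > >>         public int compareTo(Delayed o) {
> >> > >>             return Long.compare(getDelay(TimeUnit.MILLISECONDS),
> >> > >>                                 o.getDelay(TimeUnit.MILLISECONDS));
> >> > >>         }
> >> > >>     }
> >> > >>
> >> > >>     class ThrottleExpirer {
> >> > >>         final DelayQueue<ThrottledResponse> throttled = new DelayQueue<>();
> >> > >>
> >> > >>         void start() {
> >> > >>             // One daemon thread waits for the next response to
> >> > >>             // expire, so no request handler thread blocks on the delay.
> >> > >>             Thread t = new Thread(() -> {
> >> > >>                 while (true) {
> >> > >>                     try { throttled.take().send.run(); }
> >> > >>                     catch (InterruptedException e) { return; }
> >> > >>                 }
> >> > >>             }, "quota-expiration");
> >> > >>             t.setDaemon(true);
> >> > >>             t.start();
> >> > >>         }
> >> > >>     }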
> >> > >>
> >> > >> Thanks,
> >> > >>
> >> > >> Jun
> >> > >>
> >> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk> wrote:
> >> > >>
> >> > >> > Hi Jay,
> >> > >> >
> >> > >> > Regarding 1, I definitely like the simplicity of keeping a single
> >> > >> > throttle time field in the response. The downside is that the
> >> > >> > client metrics will be more coarse grained.
> >> > >> >
> >> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> >> > >> > `log.cleaner.min.cleanable.ratio`.
> >> > >> >
> >> > >> > Ismael
> >> > >> >
> >> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io> wrote:
> >> > >> >
> >> > >> > > A few minor comments:
> >> > >> > >
> >> > >> > >    1. Isn't it the case that the throttling time response field
> >> > >> > >    should have the total time your request was throttled,
> >> > >> > >    irrespective of the quotas that caused it? Limiting it to the
> >> > >> > >    byte rate quota doesn't make sense, but I also don't think we
> >> > >> > >    want to end up adding new fields in the response for every
> >> > >> > >    single thing we quota, right?
> >> > >> > >    2. I don't think we should make this quota specifically about
> >> > >> > >    io threads. Once we introduce these quotas people set them
> >> > >> > >    and expect them to be enforced (and if they aren't it may
> >> > >> > >    cause an outage). As a result they are a bit more sensitive
> >> > >> > >    than normal configs, I think. The current thread pools seem
> >> > >> > >    like something of an implementation detail and not the level
> >> > >> > >    the user-facing quotas should be involved with. I think it
> >> > >> > >    might be better to make this a general request-time throttle
> >> > >> > >    with no mention in the naming about I/O threads and simply
> >> > >> > >    acknowledge the current limitation (which we may someday fix)
> >> > >> > >    in the docs that this covers only the time after the request
> >> > >> > >    is read off the network.
> >> > >> > >    3. As such I think the right interface to the user would be
> >> > >> > >    something like percent_request_time and be in {0,...,100} or
> >> > >> > >    request_time_ratio and be in {0.0,...,1.0} (I think "ratio"
> >> > >> > >    is the terminology we used if the scale is between 0 and 1 in
> >> > >> > >    the other metrics, right?)
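> >> > >> > >
> >> > >> > > For example (illustrative values only), the same limit expressed
> >> > >> > > either way:
> >> > >> > >
> >> > >> > >     percent_request_time=10     # ten percent
> >> > >> > >     request_time_ratio=0.10     # the same limit as a ratio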
> >> > >> > >
> >> > >> > > -Jay
> >> > >> > >
> >> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram
> >> > >> > > <rajinisivaram@gmail.com> wrote:
> >> > >> > >
> >> > >> > > > Guozhang/Dong,
> >> > >> > > >
> >> > >> > > > Thank you for the feedback.
> >> > >> > > >
> >> > >> > > > Guozhang: I have updated the section on co-existence of byte
> >> > >> > > > rate and request time quotas.
> >> > >> > > >
> >> > >> > > > Dong: I hadn't added much detail to the metrics and sensors
> >> > >> > > > since they are going to be very similar to the existing
> >> > >> > > > metrics and sensors. To avoid confusion, I have now added more
> >> > >> > > > detail. All metrics are in the group "quotaType" and all
> >> > >> > > > sensors have names starting with "quotaType" (where quotaType
> >> > >> > > > is Produce/Fetch/LeaderReplication/FollowerReplication/
> >> > >> > > > *IOThread*). So there will be no reuse of existing
> >> > >> > > > metrics/sensors. The new ones for request processing time
> >> > >> > > > based throttling will be completely independent of existing
> >> > >> > > > metrics/sensors, but will be consistent in format.
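> >> > >> > > >
> >> > >> > > > For example, by analogy with the existing byte-rate sensors,
> >> > >> > > > the new ones might look like (illustrative, not final): a
> >> > >> > > > sensor named "IOThread-user1:clientA" in metrics group
> >> > >> > > > "IOThread", with throttle-time and request-time values.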
> >> > >> > > >
> >> > >> > > > The existing throttle_time_ms field in produce/fetch responses
> >> > >> > > > will not be impacted by this KIP. That will continue to return
> >> > >> > > > byte-rate based throttling times. In addition, a new field
> >> > >> > > > request_throttle_time_ms will be added to return request quota
> >> > >> > > > based throttling times. These will be exposed as new metrics
> >> > >> > > > on the client-side.
> >> > >> > > >
> >> > >> > > > Since all metrics and sensors are different for each type of
> >> > quota,
> >> > >> I
> >> > >> > > > believe there is already sufficient metrics to monitor
> >> throttling
> >> > on
> >> > >> > both
> >> > >> > > > client and broker side for each type of throttling.
> >> > >> > > >
> >> > >> > > > Regards,
> >> > >> > > >
> >> > >> > > > Rajini
> >> > >> > > >
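
For concreteness, here is a minimal sketch of how such per-quota-type
sensors could be wired up with the client metrics library that the existing
byte-rate quotas are built on. The sensor and metric names are illustrative
guesses, not names fixed by the KIP:

    import java.util.concurrent.TimeUnit
    import org.apache.kafka.common.metrics.{MetricConfig, Metrics, Quota,
      QuotaViolationException, Sensor}
    import org.apache.kafka.common.metrics.stats.Rate

    object RequestTimeQuotaSensors {
      val metrics = new Metrics()

      // One sensor per quota type and entity, e.g. "IOThread-user1:clientA",
      // mirroring how the Produce/Fetch byte-rate sensors are named.
      def ioThreadTimeSensor(entity: String, quotaUnits: Double): Sensor = {
        val config = new MetricConfig()
          .timeWindow(1, TimeUnit.SECONDS)
          .samples(11)
          .quota(Quota.upperBound(quotaUnits))
        val sensor = metrics.sensor(s"IOThread-$entity", config)
        sensor.add(
          metrics.metricName("io-thread-time-rate", "IOThread",
            s"IO thread time used per second by $entity"),
          new Rate(TimeUnit.SECONDS))
        sensor
      }

      // Returns true if recording this request's time violated the quota.
      def recordRequestTime(sensor: Sensor, timeNanos: Long): Boolean =
        try { sensor.record(timeNanos.toDouble); false }
        catch { case _: QuotaViolationException => true }
    }

As with the existing byte-rate quotas, a violation surfaces as a
QuotaViolationException from record(), which is the point at which a
throttle delay would be computed.
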
> >> > >> > > >
> >> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <
> lindong28@gmail.com
> >> >
> >> > >> wrote:
> >> > >> > > >
> >> > >> > > > > Hey Rajini,
> >> > >> > > > >
> >> > >> > > > > I think it makes a lot of sense to use io_thread_units as the
> >> > >> > > > > metric to quota a user's traffic here. LGTM overall. I have
> >> > >> > > > > some questions regarding sensors.
> >> > >> > > > >
> >> > >> > > > > - Can you be more specific in the KIP what sensors will be
> >> > added?
> >> > >> For
> >> > >> > > > > example, it will be useful to specify the name and
> >> attributes of
> >> > >> > these
> >> > >> > > > new
> >> > >> > > > > sensors.
> >> > >> > > > >
> >> > >> > > > > - We currently have throttle-time and queue-size for
> >> byte-rate
> >> > >> based
> >> > >> > > > quota.
> >> > >> > > > > Are you going to have separate throttle-time and queue-size
> >> for
> >> > >> > > requests
> >> > >> > > > > throttled by io_thread_unit-based quota, or will they share
> >> the
> >> > >> same
> >> > >> > > > > sensor?
> >> > >> > > > >
> >> > >> > > > > - Does the throttle-time in the ProduceResponse and
> >> > >> > > > > FetchResponse contain time due to the io_thread_unit-based
> >> > >> > > > > quota?
> >> > >> > > > >
> >> > >> > > > > - Currently the kafka server doesn't provide any log or
> >> > >> > > > > metrics that tell whether any given clientId (or user) is
> >> > >> > > > > throttled. This is not too bad
> >> > >> > > > > because we can still check the client-side byte-rate metric
> >> to
> >> > >> > validate
> >> > >> > > > > whether a given client is throttled. But with this
> >> > io_thread_unit,
> >> > >> > > there
> >> > >> > > > > will be no way to validate whether a given client is slow
> >> > because
> >> > >> it
> >> > >> > > has
> >> > >> > > > > exceeded its io_thread_unit limit. It is necessary for users
> >> > >> > > > > to be able to know this information to figure out whether
> >> > >> > > > > they have reached their quota limit. How about we add a log4j
> >> > >> > > > > log on the server side to periodically print
> >> > >> > > > > the (client_id, byte-rate-throttle-time,
> >> > >> > io-thread-unit-throttle-time)
> >> > >> > > so
> >> > >> > > > > that kafka administrator can figure those users that have
> >> > reached
> >> > >> > their
> >> > >> > > > > limit and act accordingly?
> >> > >> > > > >
> >> > >> > > > > Thanks,
> >> > >> > > > > Dong
> >> > >> > > > >
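
A minimal sketch of the periodic throttle log suggested above; ThrottleStats
and the stats supplier are hypothetical stand-ins for whatever the broker's
quota managers actually expose:

    import java.util.concurrent.{Executors, TimeUnit}
    import org.slf4j.LoggerFactory

    // Hypothetical snapshot of one client's throttling over the last window.
    case class ThrottleStats(clientId: String,
                             byteRateThrottleMs: Double,
                             ioThreadThrottleMs: Double)

    class ThrottleLogger(currentStats: () => Seq[ThrottleStats]) {
      private val log = LoggerFactory.getLogger(classOf[ThrottleLogger])
      private val scheduler = Executors.newSingleThreadScheduledExecutor()

      // Periodically log only the clients that were actually throttled.
      def start(periodSeconds: Long): Unit =
        scheduler.scheduleAtFixedRate(new Runnable {
          override def run(): Unit =
            currentStats()
              .filter(s => s.byteRateThrottleMs > 0 || s.ioThreadThrottleMs > 0)
              .foreach(s => log.info(
                s"client=${s.clientId} " +
                s"byte-rate-throttle-time=${s.byteRateThrottleMs}ms " +
                s"io-thread-unit-throttle-time=${s.ioThreadThrottleMs}ms"))
        }, periodSeconds, periodSeconds, TimeUnit.SECONDS)
    }
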
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > >
> >> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
> >> > >> wangguoz@gmail.com>
> >> > >> > > > wrote:
> >> > >> > > > >
> >> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
> >> comment
> >> > on
> >> > >> > the
> >> > >> > > > > > throttling implementation:
> >> > >> > > > > >
> >> > >> > > > > > Stated as "Request processing time throttling will be
> >> applied
> >> > on
> >> > >> > top
> >> > >> > > if
> >> > >> > > > > > necessary." I thought that it meant the request processing
> >> > >> > > > > > time throttling is applied first, but continuing to read I
> >> > >> > > > > > found it actually meant to apply produce / fetch byte rate
> >> > >> > > > > > throttling first.
> >> > >> > > > > >
> >> > >> > > > > > Also the last sentence "The remaining delay if any is
> >> applied
> >> > to
> >> > >> > the
> >> > >> > > > > > response." is a bit confusing to me. Maybe rewording it a
> >> bit?
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > > Guozhang
> >> > >> > > > > >
> >> > >> > > > > >
> >> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <
> jun@confluent.io
> >> >
> >> > >> wrote:
> >> > >> > > > > >
> >> > >> > > > > > > Hi, Rajini,
> >> > >> > > > > > >
> >> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks
> >> good
> >> > to
> >> > >> me.
> >> > >> > > > > > >
> >> > >> > > > > > > Jun
> >> > >> > > > > > >
> >> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> >> > >> > > > > rajinisivaram@gmail.com
> >> > >> > > > > > >
> >> > >> > > > > > > wrote:
> >> > >> > > > > > >
> >> > >> > > > > > > > Jun/Roger,
> >> > >> > > > > > > >
> >> > >> > > > > > > > Thank you for the feedback.
> >> > >> > > > > > > >
> >> > >> > > > > > > > 1. I have updated the KIP to use absolute units
> >> instead of
> >> > >> > > > > percentage.
> >> > >> > > > > > > The
> >> > >> > > > > > > > property is called *io_thread_units* to align with the
> >> > >> > > > > > > > thread count property *num.io.threads*. When we
> >> > >> > > > > > > > implement network thread utilization quotas, we can add
> >> > >> > > > > > > > another property *network_thread_units*.
> >> > >> > > > > > > >
> >> > >> > > > > > > > 2. ControlledShutdown is already listed under the
> >> exempt
> >> > >> > > requests.
> >> > >> > > > > Jun,
> >> > >> > > > > > > did
> >> > >> > > > > > > > you mean a different request that needs to be added?
> >> The
> >> > >> four
> >> > >> > > > > requests
> >> > >> > > > > > > > currently exempt in the KIP are StopReplica,
> >> > >> > ControlledShutdown,
> >> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These are controlled
> >> > using
> >> > >> > > > > > ClusterAction
> >> > >> > > > > > > > ACL, so it is easy to exclude and only throttle if
> >> > >> > unauthorized.
> >> > >> > > I
> >> > >> > > > > > wasn't
> >> > >> > > > > > > > sure if there are other requests used only for
> >> > inter-broker
> >> > >> > that
> >> > >> > > > > needed
> >> > >> > > > > > > to
> >> > >> > > > > > > > be excluded.
> >> > >> > > > > > > >
> >> > >> > > > > > > > 3. I was thinking the smallest change would be to
> >> replace
> >> > >> all
> >> > >> > > > > > references
> >> > >> > > > > > > to
> >> > >> > > > > > > > *requestChannel.sendResponse()* with a local method
> >> > >> > > > > > > > *sendResponseMaybeThrottle()* that does the throttling,
> >> > >> > > > > > > > if any, and then sends the response. If we throttle
> >> > >> > > > > > > > first in
> *KafkaApis.handle()*,
> >> > the
> >> > >> > time
> >> > >> > > > > spent
> >> > >> > > > > > > > within the method handling the request will not be
> >> > recorded
> >> > >> or
> >> > >> > > used
> >> > >> > > > > in
> >> > >> > > > > > > > throttling. We can look into this again when the PR
> is
> >> > ready
> >> > >> > for
> >> > >> > > > > > review.
> >> > >> > > > > > > >
> >> > >> > > > > > > > Regards,
> >> > >> > > > > > > >
> >> > >> > > > > > > > Rajini
> >> > >> > > > > > > >
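
The shape of the sendResponseMaybeThrottle() idea, sketched with simplified
stand-in types rather than the real broker classes; it only illustrates the
ordering of record, throttle and send:

    // Stand-in for the broker's quota bookkeeping.
    trait RequestQuotaManager {
      // Records io-thread time for this client; returns the delay owed in
      // milliseconds, or 0 if the client is within its quota.
      def recordAndGetThrottleTimeMs(clientId: String, apiTimeNanos: Long): Long
      // Completes the send only after the delay (e.g. via a purgatory).
      def delayResponse(sendResponse: () => Unit, delayMs: Long): Unit
    }

    class ApiHandler(quotas: RequestQuotaManager, sendResponse: AnyRef => Unit) {
      def sendResponseMaybeThrottle(clientId: String,
                                    requestDequeueTimeNanos: Long,
                                    response: AnyRef): Unit = {
        // Measured at send time so the full handling cost, including the
        // time spent inside the request handler, is charged to the quota.
        val apiTimeNanos = System.nanoTime() - requestDequeueTimeNanos
        val throttleMs = quotas.recordAndGetThrottleTimeMs(clientId, apiTimeNanos)
        if (throttleMs > 0)
          quotas.delayResponse(() => sendResponse(response), throttleMs)
        else
          sendResponse(response)
      }
    }
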
> >> > >> > > > > > > >
> >> > >> > > > > > > >
> >> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> >> > >> > > > > roger.hoover@gmail.com>
> >> > >> > > > > > > > wrote:
> >> > >> > > > > > > >
> >> > >> > > > > > > > > Great to see this KIP and the excellent discussion.
> >> > >> > > > > > > > >
> >> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
> >> application
> >> > is
> >> > >> > > > > allocated
> >> > >> > > > > > 1
> >> > >> > > > > > > > > request handler unit, then it's as if I have a
> Kafka
> >> > >> broker
> >> > >> > > with
> >> > >> > > > a
> >> > >> > > > > > > single
> >> > >> > > > > > > > > request handler thread dedicated to me.  That's the
> >> > most I
> >> > >> > can
> >> > >> > > > use,
> >> > >> > > > > > at
> >> > >> > > > > > > > > least.  That allocation doesn't change even if an
> >> admin
> >> > >> later
> >> > >> > > > > > increases
> >> > >> > > > > > > > the
> >> > >> > > > > > > > > size of the request thread pool on the broker.
> It's
> >> > >> similar
> >> > >> > to
> >> > >> > > > the
> >> > >> > > > > > CPU
> >> > >> > > > > > > > > abstraction that VMs and containers get from
> >> hypervisors
> >> > >> or
> >> > >> > OS
> >> > >> > > > > > > > schedulers.
> >> > >> > > > > > > > > While different client access patterns can use
> wildly
> >> > >> > different
> >> > >> > > > > > amounts
> >> > >> > > > > > > > of
> >> > >> > > > > > > > > request thread resources per request, a given
> >> > application
> >> > >> > will
> >> > >> > > > > > > generally
> >> > >> > > > > > > > > have a stable access pattern and can figure out
> >> > >> empirically
> >> > >> > how
> >> > >> > > > > many
> >> > >> > > > > > > > > "request thread units" it needs to meet it's
> >> > >> > throughput/latency
> >> > >> > > > > > goals.
> >> > >> > > > > > > > >
> >> > >> > > > > > > > > Cheers,
> >> > >> > > > > > > > >
> >> > >> > > > > > > > > Roger
> >> > >> > > > > > > > >
> >> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
> >> > >> jun@confluent.io>
> >> > >> > > > wrote:
> >> > >> > > > > > > > >
> >> > >> > > > > > > > > > Hi, Rajini,
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > Thanks for the updated KIP. A few more comments.
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > 1. A concern of request_time_percent is that it's
> >> not
> >> > an
> >> > >> > > > absolute
> >> > >> > > > > > > > value.
> >> > >> > > > > > > > > > Let's say you give a user a 10% limit. If the
> admin
> >> > >> doubles
> >> > >> > > the
> >> > >> > > > > > > number
> >> > >> > > > > > > > of
> >> > >> > > > > > > > > > request handler threads, that user now actually
> has
> >> > >> twice
> >> > >> > the
> >> > >> > > > > > > absolute
> >> > >> > > > > > > > > > capacity. This may confuse people a bit. So,
> >> perhaps
> >> > >> > setting
> >> > >> > > > the
> >> > >> > > > > > > quota
> >> > >> > > > > > > > > > based on an absolute request thread unit is
> better.
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
> >> inter-broker
> >> > >> > request
> >> > >> > > > and
> >> > >> > > > > > > needs
> >> > >> > > > > > > > to
> >> > >> > > > > > > > > > be excluded from throttling.
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > 3. Implementation wise, I am wondering if it's
> >> simpler
> >> > >> to
> >> > >> > > apply
> >> > >> > > > > the
> >> > >> > > > > > > > > request
> >> > >> > > > > > > > > > time throttling first in KafkaApis.handle().
> >> > Otherwise,
> >> > >> we
> >> > >> > > will
> >> > >> > > > > > need
> >> > >> > > > > > > to
> >> > >> > > > > > > > > add
> >> > >> > > > > > > > > > the throttling logic in each type of request.
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > Thanks,
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > Jun
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> >> > >> > > > > > > > rajinisivaram@gmail.com
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > wrote:
> >> > >> > > > > > > > > >
> >> > >> > > > > > > > > > > Jun,
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > > Thank you for the review.
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > > I have reverted to the original KIP that
> >> throttles
> >> > >> based
> >> > >> > on
> >> > >> > > > > > request
> >> > >> > > > > > > > > > handler
> >> > >> > > > > > > > > > > utilization. At the moment, it uses percentage,
> >> but
> >> > I
> >> > >> am
> >> > >> > > > happy
> >> > >> > > > > to
> >> > >> > > > > > > > > change
> >> > >> > > > > > > > > > to
> >> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if
> >> required. I
> >> > >> have
> >> > >> > > > added
> >> > >> > > > > > the
> >> > >> > > > > > > > > > examples
> >> > >> > > > > > > > > > > from this discussion to the KIP. Also added a
> >> > "Future
> >> > >> > Work"
> >> > >> > > > > > section
> >> > >> > > > > > > > to
> >> > >> > > > > > > > > > > address network thread utilization. The
> >> > configuration
> >> > >> is
> >> > >> > > > named
> >> > >> > > > > > > > > > > "request_time_percent" with the expectation
> that
> >> it
> >> > >> can
> >> > >> > > also
> >> > >> > > > be
> >> > >> > > > > > > used
> >> > >> > > > > > > > as
> >> > >> > > > > > > > > > the
> >> > >> > > > > > > > > > > limit for network thread utilization when that
> is
> >> > >> > > > implemented,
> >> > >> > > > > so
> >> > >> > > > > > > > that
> >> > >> > > > > > > > > > > users have to set only one config for the two
> and
> >> > not
> >> > >> > have
> >> > >> > > to
> >> > >> > > > > > worry
> >> > >> > > > > > > > > about
> >> > >> > > > > > > > > > > the internal distribution of the work between
> the
> >> > two
> >> > >> > > thread
> >> > >> > > > > > pools
> >> > >> > > > > > > in
> >> > >> > > > > > > > > > > Kafka.
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > > Regards,
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > > Rajini
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> >> > >> > > jun@confluent.io>
> >> > >> > > > > > > wrote:
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > > > Hi, Rajini,
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > Thanks for the proposal.
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > The benefit of using the request processing
> >> time
> >> > >> over
> >> > >> > the
> >> > >> > > > > > request
> >> > >> > > > > > > > > rate
> >> > >> > > > > > > > > > is
> >> > >> > > > > > > > > > > > exactly what people have said. I will just
> >> expand
> >> > >> that
> >> > >> > a
> >> > >> > > > bit.
> >> > >> > > > > > > > > Consider
> >> > >> > > > > > > > > > > the
> >> > >> > > > > > > > > > > > following case. The producer sends a produce
> >> > request
> >> > >> > > with a
> >> > >> > > > > > 10MB
> >> > >> > > > > > > > > > message
> >> > >> > > > > > > > > > > > but compressed to 100KB with gzip. The
> >> > >> decompression of
> >> > >> > > the
> >> > >> > > > > > > message
> >> > >> > > > > > > > > on
> >> > >> > > > > > > > > > > the
> >> > >> > > > > > > > > > > > broker could take 10-15 seconds, during which
> >> > time,
> >> > >> a
> >> > >> > > > request
> >> > >> > > > > > > > handler
> >> > >> > > > > > > > > > > > thread is completely blocked. In this case,
> >> > neither
> >> > >> the
> >> > >> > > > > byte-in
> >> > >> > > > > > > > quota
> >> > >> > > > > > > > > > nor
> >> > >> > > > > > > > > > > > the request rate quota may be effective in
> >> > >> protecting
> >> > >> > the
> >> > >> > > > > > broker.
> >> > >> > > > > > > > > > > Consider
> >> > >> > > > > > > > > > > > another case. A consumer group starts with 10
> >> > >> instances
> >> > >> > > and
> >> > >> > > > > > later
> >> > >> > > > > > > > on
> >> > >> > > > > > > > > > > > switches to 20 instances. The request rate will
> >> > >> > > > > > > > > > > > likely double, but the actual load on the broker
> >> > >> > > > > > > > > > > > may not double since each fetch request only
> >> > >> > > > > > > > > > > > contains half of the partitions. A request rate
> >> > >> > > > > > > > > > > > quota may not be easy to configure in this case.
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > What we really want is to be able to prevent
> a
> >> > >> client
> >> > >> > > from
> >> > >> > > > > > using
> >> > >> > > > > > > > too
> >> > >> > > > > > > > > > much
> >> > >> > > > > > > > > > > > of the server side resources. In this
> >> particular
> >> > >> KIP,
> >> > >> > > this
> >> > >> > > > > > > resource
> >> > >> > > > > > > > > is
> >> > >> > > > > > > > > > > the
> >> > >> > > > > > > > > > > > capacity of the request handler threads. I
> >> agree
> >> > >> that
> >> > >> > it
> >> > >> > > > may
> >> > >> > > > > > not
> >> > >> > > > > > > be
> >> > >> > > > > > > > > > > > intuitive for the users to determine how to
> set
> >> > the
> >> > >> > right
> >> > >> > > > > > limit.
> >> > >> > > > > > > > > > However,
> >> > >> > > > > > > > > > > > this is not completely new and has been done
> in
> >> > the
> >> > >> > > > container
> >> > >> > > > > > > world
> >> > >> > > > > > > > > > > > already. For example, Linux cgroup
> >> > >> > > > > > > > > > > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> >> > >> > > > > > > > > > > > has the concept of cpu.cfs_quota_us,
> >> > >> > > > > > > > > > > > which specifies the total amount of time in
> >> > >> > microseconds
> >> > >> > > > for
> >> > >> > > > > > > which
> >> > >> > > > > > > > > all
> >> > >> > > > > > > > > > > > tasks in a cgroup can run during a one second
> >> > >> period.
> >> > >> > We
> >> > >> > > > can
> >> > >> > > > > > > > > > potentially
> >> > >> > > > > > > > > > > > model the request handler threads in a
> similar
> >> > way.
> >> > >> For
> >> > >> > > > > > example,
> >> > >> > > > > > > > each
> >> > >> > > > > > > > > > > > request handler thread can be 1 request
> handler
> >> > unit
> >> > >> > and
> >> > >> > > > the
> >> > >> > > > > > > admin
> >> > >> > > > > > > > > can
> >> > >> > > > > > > > > > > > configure a limit on how many units (say
> 0.01)
> >> a
> >> > >> client
> >> > >> > > can
> >> > >> > > > > > have.
> >> > >> > > > > > > > > > > >
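
The arithmetic behind such a request handler unit, by analogy with
cpu.cfs_quota_us, might look like this sketch (all names illustrative):

    object HandlerUnits {
      // One request handler thread running for the full window is 1.0 units,
      // so a quota of `units` buys units * window of handler time per window,
      // independent of how many handler threads the broker runs.
      def allowedNanos(units: Double, windowNanos: Long): Double =
        units * windowNanos

      def main(args: Array[String]): Unit = {
        val windowNanos = 1000000000L  // one-second window
        val quotaUnits  = 0.01         // admin-configured limit for a client
        val usedNanos   = 15000000L    // 15 ms of handler time this window
        val allowed     = allowedNanos(quotaUnits, windowNanos) // 10 ms
        println(s"over quota: ${usedNanos > allowed}") // over quota: true
      }
    }
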
> >> > >> > > > > > > > > > > > Regarding not throttling the internal broker
> to
> >> > >> broker
> >> > >> > > > > > requests.
> >> > >> > > > > > > We
> >> > >> > > > > > > > > > could
> >> > >> > > > > > > > > > > > do that. Alternatively, we could just let the
> >> > admin
> >> > >> > > > > configure a
> >> > >> > > > > > > > high
> >> > >> > > > > > > > > > > limit
> >> > >> > > > > > > > > > > > for the kafka user (it may not be able to do
> >> that
> >> > >> > easily
> >> > >> > > > > based
> >> > >> > > > > > on
> >> > >> > > > > > > > > > > clientId
> >> > >> > > > > > > > > > > > though).
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > Ideally we want to be able to protect the
> >> > >> utilization
> >> > >> > of
> >> > >> > > > the
> >> > >> > > > > > > > network
> >> > >> > > > > > > > > > > thread
> >> > >> > > > > > > > > > > > pool too. The difficulty is mostly what Rajini
> >> > >> > > > > > > > > > > > said: (1) The
> >> > >> > > > > > > > mechanism
> >> > >> > > > > > > > > > for
> >> > >> > > > > > > > > > > > throttling the requests is through Purgatory
> >> and
> >> > we
> >> > >> > will
> >> > >> > > > have
> >> > >> > > > > > to
> >> > >> > > > > > > > > think
> >> > >> > > > > > > > > > > > through how to integrate that into the
> network
> >> > >> layer.
> >> > >> > > (2)
> >> > >> > > > In
> >> > >> > > > > > the
> >> > >> > > > > > > > > > network
> >> > >> > > > > > > > > > > > layer, currently we know the user, but not
> the
> >> > >> clientId
> >> > >> > > of
> >> > >> > > > > the
> >> > >> > > > > > > > > request.
> >> > >> > > > > > > > > > > So,
> >> > >> > > > > > > > > > > > it's a bit tricky to throttle based on
> clientId
> >> > >> there.
> >> > >> > > > Plus,
> >> > >> > > > > > the
> >> > >> > > > > > > > > > byteOut
> >> > >> > > > > > > > > > > > quota can already protect the network thread
> >> > >> > utilization
> >> > >> > > > for
> >> > >> > > > > > > fetch
> >> > >> > > > > > > > > > > > requests. So, if we can't figure out this
> part
> >> > right
> >> > >> > now,
> >> > >> > > > > just
> >> > >> > > > > > > > > focusing
> >> > >> > > > > > > > > > > on
> >> > >> > > > > > > > > > > > the request handling threads for this KIP is
> >> > still a
> >> > >> > > useful
> >> > >> > > > > > > > feature.
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > Thanks,
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > Jun
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini
> >> Sivaram <
> >> > >> > > > > > > > > > rajinisivaram@gmail.com
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > wrote:
> >> > >> > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > Thank you all for the feedback.
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > Jay: I have removed exemption for consumer
> >> > >> heartbeat
> >> > >> > > etc.
> >> > >> > > > > > Agree
> >> > >> > > > > > > > > that
> >> > >> > > > > > > > > > > > > protecting the cluster is more important
> than
> >> > >> > > protecting
> >> > >> > > > > > > > individual
> >> > >> > > > > > > > > > > apps.
> >> > >> > > > > > > > > > > > > Have retained the exemption for
> >> > >> > > > > > > > > > > > > StopReplica/LeaderAndIsr etc, these
> >> > >> > > > > > > > > > > are
> >> > >> > > > > > > > > > > > > throttled only if authorization fails (so
> >> can't
> >> > be
> >> > >> > used
> >> > >> > > > for
> >> > >> > > > > > DoS
> >> > >> > > > > > > > > > attacks
> >> > >> > > > > > > > > > > > in
> >> > >> > > > > > > > > > > > > a secure cluster, but allows inter-broker
> >> > >> requests to
> >> > >> > > > > > complete
> >> > >> > > > > > > > > > without
> >> > >> > > > > > > > > > > > > delays).
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > I will wait another day to see if there is
> >> > >> > > > > > > > > > > > > any objection to quotas
> >> > >> > > > > > > > > > > based
> >> > >> > > > > > > > > > > > on
> >> > >> > > > > > > > > > > > > request processing time (as opposed to
> >> request
> >> > >> rate)
> >> > >> > > and
> >> > >> > > > if
> >> > >> > > > > > > there
> >> > >> > > > > > > > > are
> >> > >> > > > > > > > > > > no
> >> > >> > > > > > > > > > > > > objections, I will revert to the original
> >> > proposal
> >> > >> > with
> >> > >> > > > > some
> >> > >> > > > > > > > > changes.
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > The original proposal was only including
> the
> >> > time
> >> > >> > used
> >> > >> > > by
> >> > >> > > > > the
> >> > >> > > > > > > > > request
> >> > >> > > > > > > > > > > > > handler threads (that made calculation
> >> easy). I
> >> > >> think
> >> > >> > > the
> >> > >> > > > > > > > > suggestion
> >> > >> > > > > > > > > > is
> >> > >> > > > > > > > > > > > to
> >> > >> > > > > > > > > > > > > include the time spent in the network
> >> threads as
> >> > >> well
> >> > >> > > > since
> >> > >> > > > > > > that
> >> > >> > > > > > > > > may
> >> > >> > > > > > > > > > be
> >> > >> > > > > > > > > > > > > significant. As Jay pointed out, it is more
> >> > >> > complicated
> >> > >> > > > to
> >> > >> > > > > > > > > calculate
> >> > >> > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > total available CPU time and convert to a
> >> > >> > > > > > > > > > > > > ratio when there are *m* I/O threads and *n*
> >> > >> > > > > > > > > > > > > network threads. ThreadMXBean#getThreadCpuTime()
> >> > >> > > > > > > > > > > > > may give us what we want, but it can be very
> >> > >> > > > > > > > > > > > > expensive on
> some
> >> > >> > > platforms.
> >> > >> > > > As
> >> > >> > > > > > > > Becket
> >> > >> > > > > > > > > > and
> >> > >> > > > > > > > > > > > > Guozhang have pointed out, we do have
> several
> >> > time
> >> > >> > > > > > measurements
> >> > >> > > > > > > > > > already
> >> > >> > > > > > > > > > > > for
> >> > >> > > > > > > > > > > > > generating metrics that we could use,
> though
> >> we
> >> > >> might
> >> > >> > > > want
> >> > >> > > > > to
> >> > >> > > > > > > > > switch
> >> > >> > > > > > > > > > to
> >> > >> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis()
> >> since
> >> > >> some
> >> > >> > of
> >> > >> > > > the
> >> > >> > > > > > > > values
> >> > >> > > > > > > > > > for
> >> > >> > > > > > > > > > > > > small requests may be < 1ms. But rather
> than
> >> add
> >> > >> up
> >> > >> > the
> >> > >> > > > > time
> >> > >> > > > > > > > spent
> >> > >> > > > > > > > > in
> >> > >> > > > > > > > > > > I/O
> >> > >> > > > > > > > > > > > > thread and network thread, wouldn't it be
> >> better
> >> > >> to
> >> > >> > > > convert
> >> > >> > > > > > the
> >> > >> > > > > > > > > time
> >> > >> > > > > > > > > > > > spent
> >> > >> > > > > > > > > > > > > on each thread into a separate ratio? UserA
> >> has
> >> > a
> >> > >> > > request
> >> > >> > > > > > quota
> >> > >> > > > > > > > of
> >> > >> > > > > > > > > > 5%.
> >> > >> > > > > > > > > > > > Can
> >> > >> > > > > > > > > > > > > we take that to mean that UserA can use 5%
> of
> >> > the
> >> > >> > time
> >> > >> > > on
> >> > >> > > > > > > network
> >> > >> > > > > > > > > > > threads
> >> > >> > > > > > > > > > > > > and 5% of the time on I/O threads? If
> either
> >> is
> >> > >> > > exceeded,
> >> > >> > > > > the
> >> > >> > > > > > > > > > response
> >> > >> > > > > > > > > > > is
> >> > >> > > > > > > > > > > > > throttled - it would mean maintaining two
> >> sets
> >> > of
> >> > >> > > metrics
> >> > >> > > > > for
> >> > >> > > > > > > the
> >> > >> > > > > > > > > two
> >> > >> > > > > > > > > > > > > durations, but would result in more
> >> meaningful
> >> > >> > ratios.
> >> > >> > > We
> >> > >> > > > > > could
> >> > >> > > > > > > > > > define
> >> > >> > > > > > > > > > > > two
> >> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request
> threads
> >> > and
> >> > >> 10%
> >> > >> > > of
> >> > >> > > > > > > network
> >> > >> > > > > > > > > > > > threads),
> >> > >> > > > > > > > > > > > > but that seems unnecessary and harder to
> >> explain
> >> > >> to
> >> > >> > > > users.
> >> > >> > > > > > > > > > > > >
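
A sketch of this single-config, two-ratio interpretation: one configured
limit checked independently against each thread pool's capacity (PoolUsage
is a hypothetical holder for per-window measurements):

    object DualRatioQuota {
      // Hypothetical per-window measurement for one thread pool.
      case class PoolUsage(usedNanos: Long, threads: Int, windowNanos: Long) {
        // Fraction of the pool's total capacity (threads * window) consumed.
        def ratio: Double = usedNanos.toDouble / (threads.toDouble * windowNanos)
      }

      // One configured ratio, enforced independently on both pools: UserA's
      // 5% means at most 5% of I/O-thread time and 5% of network-thread time.
      def shouldThrottle(io: PoolUsage, network: PoolUsage,
                         quotaRatio: Double): Boolean =
        io.ratio > quotaRatio || network.ratio > quotaRatio
    }
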
> >> > >> > > > > > > > > > > > > Back to why and how quotas are applied to
> >> > network
> >> > >> > > thread
> >> > >> > > > > > > > > utilization:
> >> > >> > > > > > > > > > > > > a) In the case of fetch,  the time spent in
> >> the
> >> > >> > network
> >> > >> > > > > > thread
> >> > >> > > > > > > > may
> >> > >> > > > > > > > > be
> >> > >> > > > > > > > > > > > > significant and I can see the need to
> include
> >> > >> this.
> >> > >> > Are
> >> > >> > > > > there
> >> > >> > > > > > > > other
> >> > >> > > > > > > > > > > > > requests where the network thread
> >> utilization is
> >> > >> > > > > significant?
> >> > >> > > > > > > In
> >> > >> > > > > > > > > the
> >> > >> > > > > > > > > > > case
> >> > >> > > > > > > > > > > > > of fetch, request handler thread
> utilization
> >> > would
> >> > >> > > > throttle
> >> > >> > > > > > > > clients
> >> > >> > > > > > > > > > > with
> >> > >> > > > > > > > > > > > > high request rate, low data volume and
> fetch
> >> > byte
> >> > >> > rate
> >> > >> > > > > quota
> >> > >> > > > > > > will
> >> > >> > > > > > > > > > > > throttle
> >> > >> > > > > > > > > > > > > clients with high data volume. Network
> thread
> >> > >> > > utilization
> >> > >> > > > > is
> >> > >> > > > > > > > > perhaps
> >> > >> > > > > > > > > > > > > proportional to the data volume. I am
> >> wondering
> >> > >> if we
> >> > >> > > > even
> >> > >> > > > > > need
> >> > >> > > > > > > > to
> >> > >> > > > > > > > > > > > throttle
> >> > >> > > > > > > > > > > > > based on network thread utilization or
> >> whether
> >> > the
> >> > >> > data
> >> > >> > > > > > volume
> >> > >> > > > > > > > > quota
> >> > >> > > > > > > > > > > > covers
> >> > >> > > > > > > > > > > > > this case.
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > b) At the moment, we record and check for
> >> quota
> >> > >> > > violation
> >> > >> > > > > at
> >> > >> > > > > > > the
> >> > >> > > > > > > > > same
> >> > >> > > > > > > > > > > > time.
> >> > >> > > > > > > > > > > > > If a quota is violated, the response is
> >> > >> > > > > > > > > > > > > delayed. Using Jay's example of disk reads for
> >> > >> > > > > > > > > > > > > fetches happening in the network thread, we
> >> > >> > > > > > > > > > > > > can't record and
> >> > >> > > > > > > > > > > > > delay a response after the disk reads. We
> >> could
> >> > >> > record
> >> > >> > > > the
> >> > >> > > > > > time
> >> > >> > > > > > > > > spent
> >> > >> > > > > > > > > > > on
> >> > >> > > > > > > > > > > > > the network thread when the response is
> >> complete
> >> > >> and
> >> > >> > > > > > introduce
> >> > >> > > > > > > a
> >> > >> > > > > > > > > > delay
> >> > >> > > > > > > > > > > > for
> >> > >> > > > > > > > > > > > > handling a subsequent request (separate out
> >> > >> recording
> >> > >> > > and
> >> > >> > > > > > quota
> >> > >> > > > > > > > > > > violation
> >> > >> > > > > > > > > > > > > handling in the case of network thread
> >> > overload).
> >> > >> > Does
> >> > >> > > > that
> >> > >> > > > > > > make
> >> > >> > > > > > > > > > sense?
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > Regards,
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > Rajini
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket
> Qin <
> >> > >> > > > > > > > becket.qin@gmail.com>
> >> > >> > > > > > > > > > > > wrote:
> >> > >> > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > Hey Jay,
> >> > >> > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time
> >> is a
> >> > >> > little
> >> > >> > > > > > > tricky. I
> >> > >> > > > > > > > > am
> >> > >> > > > > > > > > > > > > thinking
> >> > >> > > > > > > > > > > > > > that maybe we can use the existing
> request
> >> > >> > > statistics.
> >> > >> > > > > They
> >> > >> > > > > > > are
> >> > >> > > > > > > > > > > already
> >> > >> > > > > > > > > > > > > > very detailed so we can probably see the
> >> > >> > approximate
> >> > >> > > > CPU
> >> > >> > > > > > time
> >> > >> > > > > > > > > from
> >> > >> > > > > > > > > > > it,
> >> > >> > > > > > > > > > > > > e.g.
> >> > >> > > > > > > > > > > > > > something like (total_time -
> >> > >> > > > request/response_queue_time
> >> > >> > > > > -
> >> > >> > > > > > > > > > > > remote_time).
> >> > >> > > > > > > > > > > > > >
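
That estimate, written out as a sketch over simplified versions of the
per-request timing fields the broker already tracks:

    object ApproxCpuTime {
      // Simplified view of the broker's existing per-request statistics.
      case class RequestStats(totalTimeMs: Double,
                              requestQueueTimeMs: Double,
                              responseQueueTimeMs: Double,
                              remoteTimeMs: Double)

      // Whatever is neither queueing nor waiting on remote work is a rough
      // proxy for the CPU time the request consumed on the broker.
      def approxCpuTimeMs(s: RequestStats): Double =
        s.totalTimeMs - s.requestQueueTimeMs - s.responseQueueTimeMs -
          s.remoteTimeMs
    }
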
> >> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user is
> >> > >> throttled
> >> > >> > > it
> >> > >> > > > is
> >> > >> > > > > > > > likely
> >> > >> > > > > > > > > > that
> >> > >> > > > > > > > > > > > we
> >> > >> > > > > > > > > > > > > > need to see if anything has gone wrong
> >> > >> > > > > > > > > > > > > > first, and if users are well
> >> > >> > > > > > > > > > > well
> >> > >> > > > > > > > > > > > > > behaving and just need more resources, we
> >> will
> >> > >> have
> >> > >> > > to
> >> > >> > > > > bump
> >> > >> > > > > > > up
> >> > >> > > > > > > > > the
> >> > >> > > > > > > > > > > > quota
> >> > >> > > > > > > > > > > > > > for them. It is true that pre-allocating
> >> CPU
> >> > >> time
> >> > >> > > quota
> >> > >> > > > > > > > precisely
> >> > >> > > > > > > > > > for
> >> > >> > > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > users is difficult. So in practice it
> would
> >> > >> > probably
> >> > >> > > be
> >> > >> > > > > > more
> >> > >> > > > > > > > like
> >> > >> > > > > > > > > > > first
> >> > >> > > > > > > > > > > > > set
> >> > >> > > > > > > > > > > > > > a relative high protective CPU time quota
> >> for
> >> > >> > > everyone
> >> > >> > > > > and
> >> > >> > > > > > > > > increase
> >> > >> > > > > > > > > > > > that
> >> > >> > > > > > > > > > > > > > for some individual clients on demand.
> >> > >> > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > Thanks,
> >> > >> > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> >> > >> > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang
> >> > Wang <
> >> > >> > > > > > > > > wangguoz@gmail.com
> >> > >> > > > > > > > > > >
> >> > >> > > > > > > > > > > > > wrote:
> >> > >> > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see
> it
> >> > >> > happening.
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or
> >> more
> >> > >> > > > > specifically
> >> > >> > > > > > > > > > > processing
> >> > >> > > > > > > > > > > > > time
> >> > >> > > > > > > > > > > > > > > ratio instead of the request rate
> >> throttling
> >> > >> as
> >> > >> > > well.
> >> > >> > > > > > > Becket
> >> > >> > > > > > > > > has
> >> > >> > > > > > > > > > > very
> >> > >> > > > > > > > > > > > > > well
> >> > >> > > > > > > > > > > > > > > summed my rationales above, and one
> >> thing to
> >> > >> add
> >> > >> > > here
> >> > >> > > > > is
> >> > >> > > > > > > that
> >> > >> > > > > > > > > the
> >> > >> > > > > > > > > > > > > former
> >> > >> > > > > > > > > > > > > > > has a good support for both "protecting
> >> > >> against
> >> > >> > > rogue
> >> > >> > > > > > > > clients"
> >> > >> > > > > > > > > as
> >> > >> > > > > > > > > > > > well
> >> > >> > > > > > > > > > > > > as
> >> > >> > > > > > > > > > > > > > > "utilizing a cluster for multi-tenancy
> >> > usage":
> >> > >> > when
> >> > >> > > > > > > thinking
> >> > >> > > > > > > > > > about
> >> > >> > > > > > > > > > > > how
> >> > >> > > > > > > > > > > > > to
> >> > >> > > > > > > > > > > > > > > explain this to the end users, I find
> it
> >> > >> actually
> >> > >> > > > more
> >> > >> > > > > > > > natural
> >> > >> > > > > > > > > > than
> >> > >> > > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > > request rate since as mentioned above,
> >> > >> different
> >> > >> > > > > requests
> >> > >> > > > > > > > will
> >> > >> > > > > > > > > > have
> >> > >> > > > > > > > > > > > > quite
> >> > >> > > > > > > > > > > > > > > different "cost", and Kafka today already
> >> > >> > > > > > > > > > > > > > > has various request types
> >> > >> > > > > > > > > > > > > > > (produce, fetch, admin, metadata, etc),
> >> > >> because
> >> > >> > of
> >> > >> > > > that
> >> > >> > > > > > the
> >> > >> > > > > > > > > > request
> >> > >> > > > > > > > > > > > > rate
> >> > >> > > > > > > > > > > > > > > throttling may not be as effective
> >> unless it
> >> > >> is
> >> > >> > set
> >> > >> > > > > very
> >> > >> > > > > > > > > > > > > conservatively.
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > Regarding user reactions when they are
> >> > >> > > > > > > > > > > > > > > throttled, I think it may differ
> >> > >> > > > > > > > > > > > > > > case-by-case, and need to be
> discovered /
> >> > >> guided
> >> > >> > by
> >> > >> > > > > > looking
> >> > >> > > > > > > > at
> >> > >> > > > > > > > > > > > relative
> >> > >> > > > > > > > > > > > > > > metrics. So in other words users would
> >> not
> >> > >> expect
> >> > >> > > to
> >> > >> > > > > get
> >> > >> > > > > > > > > > additional
> >> > >> > > > > > > > > > > > > > > information by simply being told "hey,
> >> you
> >> > are
> >> > >> > > > > > throttled",
> >> > >> > > > > > > > > which
> >> > >> > > > > > > > > > is
> >> > >> > > > > > > > > > > > all
> >> > >> > > > > > > > > > > > > > > what throttling does; they need to
> take a
> >> > >> > follow-up
> >> > >> > > > > step
> >> > >> > > > > > > and
> >> > >> > > > > > > > > see
> >> > >> > > > > > > > > > > > "hmm,
> >> > >> > > > > > > > > > > > > > I'm
> >> > >> > > > > > > > > > > > > > > throttled probably because of ..",
> which
> >> is
> >> > by
> >> > >> > > > looking
> >> > >> > > > > at
> >> > >> > > > > > > > other
> >> > >> > > > > > > > > > > > metric
> >> > >> > > > > > > > > > > > > > > values: e.g. whether I'm bombarding the
> >> > >> brokers
> >> > >> > > with
> >> > >> > > > > > > metadata
> >> > >> > > > > > > > > > > > request,
> >> > >> > > > > > > > > > > > > > > which are usually cheap to handle but
> I'm
> >> > >> sending
> >> > >> > > > > > thousands
> >> > >> > > > > > > > per
> >> > >> > > > > > > > > > > > second;
> >> > >> > > > > > > > > > > > > > or
> >> > >> > > > > > > > > > > > > > > is it because I'm catching up and hence
> >> > >> sending
> >> > >> > > very
> >> > >> > > > > > heavy
> >> > >> > > > > > > > > > fetching
> >> > >> > > > > > > > > > > > > > request
> >> > >> > > > > > > > > > > > > > > with large min.bytes, etc.
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > Regarding the implementation, as once
> >> > >> > > > > > > > > > > > > > > discussed with Jun, this seems not
> >> > >> > > > > > > > > > > > > > > very difficult since today we are
> already
> >> > >> > > collecting
> >> > >> > > > > the
> >> > >> > > > > > > > > "thread
> >> > >> > > > > > > > > > > pool
> >> > >> > > > > > > > > > > > > > > utilization" metrics, which is a single
> >> > >> > percentage
> >> > >> > > > > > > > > > > > "aggregateIdleMeter"
> >> > >> > > > > > > > > > > > > > > value; but we are already effectively
> >> > >> aggregating
> >> > >> > > it
> >> > >> > > > > for
> >> > >> > > > > > > each
> >> > >> > > > > > > > > > > > requests
> >> > >> > > > > > > > > > > > > in
> >> > >> > > > > > > > > > > > > > > KafkaRequestHandler, and we can just
> >> extend
> >> > >> it by
> >> > >> > > > > > recording
> >> > >> > > > > > > > the
> >> > >> > > > > > > > > > > > source
> >> > >> > > > > > > > > > > > > > > client id when handling them and
> >> aggregating
> >> > >> by
> >> > >> > > > > clientId
> >> > >> > > > > > as
> >> > >> > > > > > > > > well
> >> > >> > > > > > > > > > as
> >> > >> > > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > > total aggregate.
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > Guozhang
> >> > >> > > > > > > > > > > > > > >
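
A sketch of that extension: tag each handler-time measurement with the
source clientId and keep a per-client aggregate alongside the total. The
class is hypothetical; real code would record into metrics sensors rather
than plain counters:

    import java.util.concurrent.ConcurrentHashMap
    import java.util.concurrent.atomic.LongAdder

    class HandlerTimeRecorder {
      private val perClient = new ConcurrentHashMap[String, LongAdder]()
      private val total = new LongAdder

      // Called from the request handler loop with the time one request took.
      def record(clientId: String, nanos: Long): Unit = {
        perClient.computeIfAbsent(clientId, _ => new LongAdder).add(nanos)
        total.add(nanos)
      }

      // Share of all handler time consumed by one client so far.
      def clientShare(clientId: String): Double = {
        val totalNanos = total.sum()
        if (totalNanos == 0L) 0.0
        else Option(perClient.get(clientId))
          .fold(0.0)(_.sum().toDouble / totalNanos)
      }
    }
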
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay
> >> Kreps <
> >> > >> > > > > > > jay@confluent.io
> >> > >> > > > > > > > >
> >> > >> > > > > > > > > > > wrote:
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
> >> > >> > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > When I thought about it more deeply I
> >> came
> >> > >> > around
> >> > >> > > > to
> >> > >> > > > > > the
> >> > >> > > > > > > > > > "percent
> >> > >> > > > > > > > > > > > of
> >> > >> > > > > > > > > > > > > > > > processing time" metric too. It
> seems a
> >> > lot
> >> > >> > > closer
> >> > >> > > > to
> >> > >> > > > > > the
> >> > >> > > > > > > > > thing
> >> > >> > > > > > > > > > > we
> >> > >> > > > > > > > > > > > > > > actually
> >> > >> > > > > > > > > > > > > > > > care about and need to protect. I
> also
> >> > think
> >> > >> > this
> >> > >> > > > > would
> >> > >> > > > > > > be
> >> > >> > > > > > > > a
> >> > >> > > > > > > > > > very
> >> > >> > > > > > > > > > > > > > useful
> >> > >> > > > > > > > > > > > > > > > metric even in the absence of
> >> throttling
> >> > >> just
> >> > >> > to
> >> > >> > > > > debug
> >> > >> > > > > > > > whose
> >> > >> > > > > > > > > > > using
> >> > >> > > > > > > > > > > > > > > > capacity.
> >> > >> > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > Two problems to consider:
> >> > >> > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > >    1. I agree that for the user it is
> >> > >> > > > understandable
> >> > >> > > > > > what
> >> > >> > > > > > > > > lead
> >> > >> > > > > > > > > > to
> >> > >> > > > > > > > > > > > > their
> >> > >> > > > > > > > > > > > > > > >    being throttled, but it is a bit
> >> hard
> >> > to
> >> > >> > > figure
> >> > >> > > > > out
> >> > >> > > > > > > the
> >> > >> > > > > > > > > safe
> >> > >> > > > > > > > > > > > range
> >> > >> > > > > > > > > > > > > > for
> >> > >> > > > > > > > > > > > > > > >    them. i.e. if I have a new app
> that
> >> > will
> >> > >> > send
> >> > >> > > > 200
> >> > >> > > > > > > > > > > messages/sec I
> >> > >> > > > > > > > > > > > > can
> >> > >> > > > > > > > > > > > > > > >    probably reason that I'll be under
> >> the
> >> > >> > > > throttling
> >> > >> > > > > > > limit
> >> > >> > > > > > > > of
> >> > >> > > > > > > > > > 300
> >> > >> > > > > > > > > > > > > > > req/sec.
> >> > >> > > > > > > > > > > > > > > >    However if I need to be under a
> 10%
> >> CPU
> >> > >> > > > resources
> >> > >> > > > > > > limit
> >> > >> > > > > > > > it
> >> > >> > > > > > > > > > may
> >> > >> > > > > > > > > > > > be
> >> > >> > > > > > > > > > > > > a
> >> > >> > > > > > > > > > > > > > > bit
> >> > >> > > > > > > > > > > > > > > >    harder for me to know a priori if I
> >> > >> > > > > > > > > > > > > > > >    will or won't.
> >> > >> > > > > > > > > > > > > > > >    2. Calculating the available CPU
> >> time
> >> > is
> >> > >> a
> >> > >> > bit
> >> > >> > > > > > > difficult
> >> > >> > > > > > > > > > since
> >> > >> > > > > > > > > > > > > there
> >> > >> > > > > > > > > > > > > > > are
> >> > >> > > > > > > > > > > > > > > >    actually two thread pools--the I/O
> >> > >> threads
> >> > >> > and
> >> > >> > > > the
> >> > >> > > > > > > > network
> >> > >> > > > > > > > > > > > > threads.
> >> > >> > > > > > > > > > > > > > I
> >> > >> > > > > > > > > > > > > > > > think
> >> > >> > > > > > > > > > > > > > > >    it might be workable to count just
> >> the
> >> > >> I/O
> >> > >> > > > thread
> >> > >> > > > > > time
> >> > >> > > > > > > > as
> >> > >> > > > > > > > > in
> >> > >> > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > > > proposal,
> >> > >> > > > > > > > > > > > > > > >    but the network thread work is
> >> actually
> >> > >> > > > > non-trivial
> >> > >> > > > > > > > (e.g.
> >> > >> > > > > > > > > > all
> >> > >> > > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > disk
> >> > >> > > > > > > > > > > > > > > >    reads for fetches happen in that
> >> > >> thread). If
> >> > >> > > you
> >> > >> > > > > > count
> >> > >> > > > > > > > > both
> >> > >> > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > > network
> >> > >> > > > > > > > > > > > > > > > and
> >> > >> > > > > > > > > > > > > > > >    I/O threads it can skew things a
> >> bit.
> >> > >> E.g.
> >> > >> > say
> >> > >> > > > you
> >> > >> > > > > > > have
> >> > >> > > > > > > > 50
> >> > >> > > > > > > > > > > > network
> >> > >> > > > > > > > > > > > > > > > threads,
> >> > >> > > > > > > > > > > > > > > >    10 I/O threads, and 8 cores, what
> is
> >> > the
> >> > >> > > > available
> >> > >> > > > > > cpu
> >> > >> > > > > > > > > time
> >> > >> > > > > > > > > > > > > > available
> >> > >> > > > > > > > > > > > > > > > in a
> >> > >> > > > > > > > > > > > > > > >    second? I suppose this is a
> problem
> >> > >> whenever
> >> > >> > > you
> >> > >> > > > > > have
> >> > >> > > > > > > a
> >> > >> > > > > > > > > > > > bottleneck
> >> > >> > > > > > > > > > > > > > > > between
> >> > >> > > > > > > > > > > > > > > >    I/O and network threads or if you
> >> end
> >> > up
> >> > >> > > > > > significantly
> >> > >> > > > > > > > > > > > > > > over-provisioning
> >> > >> > > > > > > > > > > > > > > >    one pool (both of which are hard
> to
> >> > >> avoid).
> >> > >> > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > An alternative for CPU throttling would
> >> > >> > > > > > > > > > > > > > > > be to use this api:
> >> > >> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> >> > >> > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > That would let you track actual CPU
> >> usage
> >> > >> > across
> >> > >> > > > the
> >> > >> > > > > > > > network,
> >> > >> > > > > > > > > > I/O
> >> > >> > > > > > > > > > > > > > > threads,
> >> > >> > > > > > > > > > > > > > > > and purgatory threads and look at it
> >> as a
> >> > >> > > > percentage
> >> > >> > > > > of
> >> > >> > > > > > > > total
> >> > >> > > > > > > > > > > > cores.
> >> > >> > > > > > > > > > > > > I
> >> > >> > > > > > > > > > > > > > > > think this fixes many problems in the
> >> > >> > reliability
> >> > >> > > > of
> >> > >> > > > > > the
> >> > >> > > > > > > > > > metric.
> >> > >> > > > > > > > > > > > It's
> >> > >> > > > > > > > > > > > > > > > meaning is slightly different as it
> is
> >> > just
> >> > >> CPU
> >> > >> > > > (you
> >> > >> > > > > > > don't
> >> > >> > > > > > > > > get
> >> > >> > > > > > > > > > > > > charged
> >> > >> > > > > > > > > > > > > > > for
> >> > >> > > > > > > > > > > > > > > > time blocking on I/O) but that may be
> >> okay
> >> > >> > > because
> >> > >> > > > we
> >> > >> > > > > > > > already
> >> > >> > > > > > > > > > > have
> >> > >> > > > > > > > > > > > a
> >> > >> > > > > > > > > > > > > > > > throttle on I/O. The downside is I
> >> think
> >> > it
> >> > >> is
> >> > >> > > > > possible
> >> > >> > > > > > > > this
> >> > >> > > > > > > > > > api
> >> > >> > > > > > > > > > > > can
> >> > >> > > > > > > > > > > > > be
> >> > >> > > > > > > > > > > > > > > > disabled or isn't always available
> and
> >> it
> >> > >> may
> >> > >> > > also
> >> > >> > > > be
> >> > >> > > > > > > > > expensive
> >> > >> > > > > > > > > > > > (also
> >> > >> > > > > > > > > > > > > > > I've
> >> > >> > > > > > > > > > > > > > > > never used it so not sure if it really
> >> > >> > > > > > > > > > > > > > > > works the way I think).
> >> > >> > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > -Jay
> >> > >> > > > > > > > > > > > > > > >
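
For reference, a sketch of the ThreadMXBean approach: sample CPU time for a
known set of broker threads and express it as a fraction of total core
capacity over the interval. This only illustrates the JDK API; it is not
proposed broker code:

    import java.lang.management.ManagementFactory

    object ThreadCpuSampler {
      private val bean = ManagementFactory.getThreadMXBean

      // Samples CPU time for the given thread ids over `intervalMs` and
      // returns it as a fraction of the machine's total core capacity.
      def cpuRatio(threadIds: Seq[Long], intervalMs: Long): Double = {
        if (!bean.isThreadCpuTimeSupported) return 0.0
        bean.setThreadCpuTimeEnabled(true)
        val before = threadIds.map(id => id -> bean.getThreadCpuTime(id)).toMap
        Thread.sleep(intervalMs)
        val usedNanos = threadIds.map { id =>
          val b = before(id)
          val a = bean.getThreadCpuTime(id)
          if (a < 0L || b < 0L) 0L else a - b // -1 means the thread has died
        }.sum
        val capacityNanos =
          Runtime.getRuntime.availableProcessors().toDouble * intervalMs * 1000000.0
        usedNanos.toDouble / capacityNanos
      }
    }
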
> >> > >> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM,
> Becket
> >> > Qin
> >> > >> <
> >> > >> > > > > > > > > > > becket.qin@gmail.com>
> >> > >> > > > > > > > > > > > > > > wrote:
> >> > >> > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > If the purpose of the KIP is only
> to
> >> > >> protect
> >> > >> > > the
> >> > >> > > > > > > cluster
> >> > >> > > > > > > > > from
> >> > >> > > > > > > > > > > > being
> >> > >> > > > > > > > > > > > > > > > > overwhelmed by crazy clients and is
> >> not
> >> > >> > > intended
> >> > >> > > > to
> >> > >> > > > > > > > address
> >> > >> > > > > > > > > > > > > resource
> >> > >> > > > > > > > > > > > > > > > > allocation problem among the
> >> clients, I
> >> > am
> >> > >> > > > > wondering
> >> > >> > > > > > if
> >> > >> > > > > > > > > using
> >> > >> > > > > > > > > > > > > request
> >> > >> > > > > > > > > > > > > > > > > handling time quota (CPU time
> quota)
> >> is
> >> > a
> >> > >> > > better
> >> > >> > > > > > > option.
> >> > >> > > > > > > > > Here
> >> > >> > > > > > > > > > > are
> >> > >> > > > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > > > > reasons:
> >> > >> > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > 1. request handling time quota has
> >> > better
> >> > >> > > > > protection.
> >> > >> > > > > > > Say
> >> > >> > > > > > > > > we
> >> > >> > > > > > > > > > > have
> >> > >> > > > > > > > > > > > > > > request
> >> > >> > > > > > > > > > > > > > > > > rate quota and set that to some
> value
> >> > like
> >> > >> > 100
> >> > >> > > > > > > > > requests/sec,
> >> > >> > > > > > > > > > it
> >> > >> > > > > > > > > > > > is
> >> > >> > > > > > > > > > > > > > > > possible
> >> > >> > > > > > > > > > > > > > > > > that some of the requests are very
> >> > >> > > > > > > > > > > > > > > > > expensive and actually take a lot of
> >> > >> > > > > > > > > > > > > > > > > time to
> >> > >> > > > > > > > > > > > > > > > > handle. In that case a few clients
> >> may
> >> > >> still
> >> > >> > > > > occupy a
> >> > >> > > > > > > lot
> >> > >> > > > > > > > > of
> >> > >> > > > > > > > > > > CPU
> >> > >> > > > > > > > > > > > > time
> >> > >> > > > > > > > > > > > > > > > even
> >> > >> > > > > > > > > > > > > > > > > the request rate is low. Arguably
> we
> >> can
> >> > >> > > > carefully
> >> > >> > > > > > set
> >> > >> > > > > > > > > > request
> >> > >> > > > > > > > > > > > rate
> >> > >> > > > > > > > > > > > > > > quota
> >> > >> > > > > > > > > > > > > > > > > for each request and client id
> >> > >> combination,
> >> > >> > but
> >> > >> > > > it
> >> > >> > > > > > > could
> >> > >> > > > > > > > > > still
> >> > >> > > > > > > > > > > be
> >> > >> > > > > > > > > > > > > > > tricky
> >> > >> > > > > > > > > > > > > > > > to
> >> > >> > > > > > > > > > > > > > > > > get it right for everyone.
> >> > >> > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > If we use the request time handling
> >> > >> quota, we
> >> > >> > > can
> >> > >> > > > > > > simply
> >> > >> > > > > > > > > say
> >> > >> > > > > > > > > > no
> >> > >> > > > > > > > > > > > > > clients
> >> > >> > > > > > > > > > > > > > > > can
> >> > >> > > > > > > > > > > > > > > > > take up to more than 30% of the
> total
> >> > >> request
> >> > >> > > > > > handling
> >> > >> > > > > > > > > > capacity
> >> > >> > > > > > > > > > > > > > > (measured
> >> > >> > > > > > > > > > > > > > > > > by time), regardless of the
> >> difference
> >> > >> among
> >> > >> > > > > > different
> >> > >> > > > > > > > > > requests
> >> > >> > > > > > > > > > > > or
> >> > >> > > > > > > > > > > > > > what
> >> > >> > > > > > > > > > > > > > > > is
> >> > >> > > > > > > > > > > > > > > > > the client doing. In this case
> maybe
> >> we
> >> > >> can
> >> > >> > > quota
> >> > >> > > > > all
> >> > >> > > > > > > the
> >> > >> > > > > > > > > > > > requests
> >> > >> > > > > > > > > > > > > if
> >> > >> > > > > > > > > > > > > > > we
> >> > >> > > > > > > > > > > > > > > > > want to.
> >> > >> > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > 2. The main benefit of using
> request
> >> > rate
> >> > >> > limit
> >> > >> > > > is
> >> > >> > > > > > that
> >> > >> > > > > > > > it
> >> > >> > > > > > > > > > > seems
> >> > >> > > > > > > > > > > > > more
> >> > >> > > > > > > > > > > > > > > > > intuitive. It is true that it is
> >> > probably
> >> > >> > > easier
> >> > >> > > > to
> >> > >> > > > > > > > explain
> >> > >> > > > > > > > > > to
> >> > >> > > > > > > > > > > > the
> >> > >> > > > > > > > > > > > > > user
> >> > >> > > > > > > > > > > > > > > > > what does that mean. However, in
> >> > >> > > > > > > > > > > > > > > > > what does that mean. However, in
> >> > >> > > > > > > > > > > > > > > > > practice it looks like the impact of
> >> > >> > > > > > > > > > > > > > > > > request
> >> than
> >> > >> the
> >> > >> > > > > request
> >> > >> > > > > > > > > handling
> >> > >> > > > > > > > > > > > time
> >> > >> > > > > > > > > > > > > > > quota.
> >> > >> > > > > > > > > > > > > > > > > Unlike the byte rate quota, it is
> >> still
> >> > >> > > difficult
> >> > >> > > > > to
> >> > >> > > > > > > > give a
> >> > >> > > > > > > > > > > > number
> >> > >> > > > > > > > > > > > > > > about
> >> > >> > > > > > > > > > > > > > > > > impact of throughput or latency
> when
> >> a
> >> > >> > request
> >> > >> > > > rate
> >> > >> > > > > > > quota
> >> > >> > > > > > > > > is
> >> > >> > > > > > > > > > > hit.
> >> > >> > > > > > > > > > > > > So
> >> > >> > > > > > > > > > > > > > it
> >> > >> > > > > > > > > > > > > > > > is
> >> > >> > > > > > > > > > > > > > > > > not better than the request
> handling
> >> > time
> >> > >> > > quota.
> >> > >> > > > In
> >> > >> > > > > > > fact
> >> > >> > > > > > > > I
> >> > >> > > > > > > > > > feel
> >> > >> > > > > > > > > > > > it
> >> > >> > > > > > > > > > > > > is
> >> > >> > > > > > > > > > > > > > > > > clearer to tell user that "you are
> >> > >> > > > > > > > > > > > > > > > > limited because you have taken 30% of the
> >> > >> > > > > > > > > > > > > > > > > CPU time on the broker" than otherwise
> >> > >> > > > > > > > > > > > > > > > > something like "your request rate quota on
> >> > >> > > > > > > > > > > > > > > > > metadata request has been reached".
> >> > >> > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > Thanks,
> >> > >> > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> >> > >> > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps
> >> > >> > > > > > > > > > > > > > > > > <jay@confluent.io> wrote:
> >> > >> > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > I think this proposal makes a lot of sense
> >> > >> > > > > > > > > > > > > > > > > > (especially now that it is oriented around
> >> > >> > > > > > > > > > > > > > > > > > request rate) and fills the biggest
> >> > >> > > > > > > > > > > > > > > > > > remaining gap in the multi-tenancy story.
> >> > >> > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > I think for intra-cluster communication
> >> > >> > > > > > > > > > > > > > > > > > (StopReplica, etc) we could avoid
> >> > >> > > > > > > > > > > > > > > > > > throttling entirely. You can secure or
> >> > >> > > > > > > > > > > > > > > > > > otherwise lock down the cluster
> >> > >> > > > > > > > > > > > > > > > > > communication to avoid any unauthorized
> >> > >> > > > > > > > > > > > > > > > > > external party from trying to initiate
> >> > >> > > > > > > > > > > > > > > > > > these requests. As a result we are as
> >> > >> > > > > > > > > > > > > > > > > > likely to cause problems as solve them by
> >> > >> > > > > > > > > > > > > > > > > > throttling these, right?
> >> > >> > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > I'm not so sure that we should exempt the
> >> > >> > > > > > > > > > > > > > > > > > consumer requests such as heartbeat. It's
> >> > >> > > > > > > > > > > > > > > > > > true that if we throttle an app's
> >> > >> > > > > > > > > > > > > > > > > > heartbeat requests it may cause it to fall
> >> > >> > > > > > > > > > > > > > > > > > out of its consumer group. However, if we
> >> > >> > > > > > > > > > > > > > > > > > don't throttle it, it may DDOS the cluster
> >> > >> > > > > > > > > > > > > > > > > > if the heartbeat interval is set
> >> > >> > > > > > > > > > > > > > > > > > incorrectly or if some client in some
> >> > >> > > > > > > > > > > > > > > > > > language has a bug. I think the policy
> >> > >> > > > > > > > > > > > > > > > > > with this kind of throttling is to protect
> >> > >> > > > > > > > > > > > > > > > > > the cluster above any individual app,
> >> > >> > > > > > > > > > > > > > > > > > right? I think in general this should be
> >> > >> > > > > > > > > > > > > > > > > > okay since for most deployments this
> >> > >> > > > > > > > > > > > > > > > > > setting is meant as more of a safety
> >> > >> > > > > > > > > > > > > > > > > > valve---that is, rather than set something
> >> > >> > > > > > > > > > > > > > > > > > very close to what you expect to need (say
> >> > >> > > > > > > > > > > > > > > > > > 2 req/sec or whatever) you would have
> >> > >> > > > > > > > > > > > > > > > > > something quite high (like 100 req/sec)
> >> > >> > > > > > > > > > > > > > > > > > with this meant to prevent a client gone
> >> > >> > > > > > > > > > > > > > > > > > crazy. I think when used this way allowing
> >> > >> > > > > > > > > > > > > > > > > > those to be throttled would actually
> >> > >> > > > > > > > > > > > > > > > > > provide meaningful protection.
> >> > >> > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > -Jay
> >> > >> > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini
> >> > >> > > > > > > > > > > > > > > > > > Sivaram <rajinisivaram@gmail.com> wrote:
> >> > >> > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > > Hi all,
> >> > >> > > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > > I have just created KIP-124 to introduce
> >> > >> > > > > > > > > > > > > > > > > > > request rate quotas to Kafka:
> >> > >> > > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> >> > >> > > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > > The proposal is for a simple percentage
> >> > >> > > > > > > > > > > > > > > > > > > request handling time quota that can be
> >> > >> > > > > > > > > > > > > > > > > > > allocated to *<client-id>*, *<user>* or
> >> > >> > > > > > > > > > > > > > > > > > > *<user, client-id>*. There are a few
> >> > >> > > > > > > > > > > > > > > > > > > other suggestions also under "Rejected
> >> > >> > > > > > > > > > > > > > > > > > > alternatives". Feedback and suggestions
> >> > >> > > > > > > > > > > > > > > > > > > are welcome.
> >> > >> > > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > > Thank you...
> >> > >> > > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > > Regards,
> >> > >> > > > > > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > > > > > Rajini
> >> > >> > > > > > > > > > > > > > >
> >> > >> > > > > > > > > > > > > > > --
> >> > >> > > > > > > > > > > > > > > -- Guozhang
> >> > >> > > > > >
> >> > >> > > > > > --
> >> > >> > > > > > -- Guozhang

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
I have updated the KIP based on the discussions so far.


Regards,

Rajini

On Thu, Feb 23, 2017 at 11:29 PM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Thank you all for the feedback.
>
> Ismael #1. It makes sense not to throttle inter-broker requests like
> LeaderAndIsr etc. The simplest way to ensure that clients cannot use these
> requests to bypass quotas for DoS attacks is to ensure that ACLs prevent
> clients from using these requests, and that unauthorized requests are
> counted towards quotas.
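
(For illustration, a minimal sketch of that exemption rule; the class and
method names here are hypothetical, not actual broker code. Authorized
ClusterAction requests bypass throttling, while unauthorized uses of the same
APIs are still charged to the sender's quota.)

    import java.util.Arrays;
    import java.util.HashSet;
    import java.util.Set;

    public class ThrottleExemptionSketch {
        // The four inter-broker APIs the KIP exempts when authorized.
        static final Set<String> CLUSTER_ACTION_APIS = new HashSet<>(Arrays.asList(
                "StopReplica", "ControlledShutdown", "LeaderAndIsr", "UpdateMetadata"));

        static boolean exemptFromThrottling(String api, boolean authorized) {
            return CLUSTER_ACTION_APIS.contains(api) && authorized;
        }

        public static void main(String[] args) {
            System.out.println(exemptFromThrottling("LeaderAndIsr", true));  // true: not throttled
            System.out.println(exemptFromThrottling("LeaderAndIsr", false)); // false: counted towards quota
        }
    }
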
>
> Ismael #2, Jay #1: I was thinking that these quotas can return a separate
> throttle time, and all utilization based quotas could use the same field
> (we won't add another one for network thread utilization for instance). But
> perhaps it makes sense to keep byte rate quotas separate in produce/fetch
> responses to provide separate metrics? Agree with Ismael that the name of
> the existing field should be changed if we have two. Happy to switch to a
> single combined throttle time if that is sufficient.
>
> Ismael #4, #5, #6: Will update KIP. Will use a dot-separated name for the
> new property. Replication quotas use dot-separated names, so it will be
> consistent with all properties except byte rate quotas.
>
> Radai: #1 Request processing time rather than request rate was chosen
> because the time per request can vary significantly between requests, as
> mentioned in the discussion and the KIP.
> #2 Two separate quotas for heartbeats/regular requests feel like more
> configuration and more metrics. Since most users would set quotas higher
> than the expected usage and quotas are more of a safety net, a single quota
> should work in most cases.
> #3 The number of requests in purgatory is limited by the number of active
> connections since only one request per connection will be throttled at a
> time.
> #4 As with byte rate quotas, to use the full allocated quotas,
> clients/users would need to use partitions that are distributed across the
> cluster. The alternative of using cluster-wide quotas instead of per-broker
> quotas would be far too complex to implement.
>
> Dong: We currently have two ClientQuotaManagers for quota types Fetch and
> Produce. A new one will be added for IOThread, which manages quotas for I/O
> thread utilization. This will not update the Fetch or Produce queue-size,
> but will have a separate metric for the queue-size.  I wasn't planning to
> add any additional metrics apart from the equivalent ones for existing
> quotas as part of this KIP. Ratio of byte-rate to I/O thread utilization
> could be slightly misleading since it depends on the sequence of requests.
> But we can look into more metrics after the KIP is implemented if required.
>
> I think we need to limit the maximum delay since all requests are
> throttled. If a client has a quota of 0.001 units and a single request uses
> 50ms, we don't want to delay all requests from the client by 50 seconds
> (50ms of request time divided by a 0.001 quota), throwing the client out of
> all its consumer groups. The issue arises only if a user is allocated a
> quota that is insufficient to process one large request. The expectation is
> that the units allocated per user will be much higher than the time taken
> to process one request, so the limit should seldom be applied. Agree this
> needs proper documentation.
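
(To make that arithmetic concrete, a sketch of the delay computation and its
cap; the 10-second window and the exact formula, roughly delay = time used /
quota minus one window, capped at one window, are illustrative assumptions
rather than the KIP's final definition.)

    public class ThrottleDelaySketch {
        // usedMs: request time charged in the window; quota: allowed fraction
        // of one request handler thread; windowMs: quota window length.
        static long delayMs(double usedMs, double quota, long windowMs) {
            long uncapped = (long) (usedMs / quota) - windowMs;
            return Math.max(0, Math.min(uncapped, windowMs)); // cap at the window
        }

        public static void main(String[] args) {
            // One 50 ms request against a 0.001 quota and a 10 s window:
            // uncapped this would be 40,000 ms; the cap reduces it to 10,000 ms.
            System.out.println(delayMs(50.0, 0.001, 10_000));
        }
    }
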
>
> Regards,
>
> Rajini
>
>
> On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com> wrote:
>
>> @jun: i wasn't concerned about tying up a request processing thread, but
>> IIUC the code does still read the entire request out, which might add up
>> to a non-negligible amount of memory.
>>
>> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com> wrote:
>>
>> > Hey Rajini,
>> >
>> > The current KIP says that the maximum delay will be reduced to the
>> > window size if it is larger than the window size. I have a concern with
>> > this:
>> >
>> > 1) This essentially means that the user is allowed to exceed their quota
>> > over a long period of time. Can you provide an upper bound on this
>> > deviation?
>> >
>> > 2) What is the motivation for capping the maximum delay at the window
>> > size? I am wondering if there is a better alternative to address the
>> > problem.
>> >
>> > 3) It means that the existing metric-related config will have a more
>> > direct impact on the mechanism of this io-thread-unit-based quota. This
>> > may be an important change depending on the answer to 1) above. We
>> > probably need to document this more explicitly.
>> >
>> > Dong
>> >
>> >
>> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com> wrote:
>> >
>> > > Hey Jun,
>> > >
>> > > Yeah you are right. I thought it wasn't because at LinkedIn it would
>> > > put too much pressure on inGraph to expose those per-clientId metrics,
>> > > so we ended up printing them periodically to a local log. Never mind
>> > > if it is not a general problem.
>> > >
>> > > Hey Rajini,
>> > >
>> > > - I agree with Jay that we probably don't want to add a new field for
>> > > every quota in ProduceResponse or FetchResponse. Is there any use-case
>> > > for having separate throttle-time fields for the byte-rate quota and
>> > > the io-thread-unit quota? You probably need to document this as an
>> > > interface change if you plan to add a new field in any request.
>> > >
>> > > - I don't think IOThread belongs to quotaType. The existing quota
>> > > types (i.e. Produce/Fetch/LeaderReplication/FollowerReplication)
>> > > identify the type of request that is throttled, not the quota
>> > > mechanism that is applied.
>> > >
>> > > - If a request is throttled due to this io-thread-unit-based quota, is
>> > > the existing queue-size metric in ClientQuotaManager incremented?
>> > >
>> > > - In the interest of providing a guideline for admins to decide the
>> > > io-thread-unit-based quota and for users to understand its impact on
>> > > their traffic, would it be useful to have a metric that shows the
>> > > overall byte-rate per io-thread-unit? Can we also show this as a
>> > > per-clientId metric?
>> > >
>> > > Thanks,
>> > > Dong
>> > >
>> > >
>> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:
>> > >
>> > >> Hi, Ismael,
>> > >>
>> > >> For #3, typically, an admin won't configure more io threads than CPU
>> > >> cores, but it's possible for an admin to start with fewer io threads
>> > >> than cores and grow that later on.
>> > >>
>> > >> Hi, Dong,
>> > >>
>> > >> I think the throttleTime sensor on the broker tells the admin whether
>> > >> a user/clientId is throttled or not.
>> > >>
>> > >> Hi, Radai,
>> > >>
>> > >> The reasoning for delaying the throttled requests on the broker
>> > >> instead of returning an error immediately is that the latter has no
>> > >> way to prevent the client from retrying immediately, which will make
>> > >> things worse. The delaying logic is based on a delay queue. A
>> > >> separate expiration thread just waits on the next request to expire.
>> > >> So, it doesn't tie up a request handler thread.
>> > >>
>> > >> Thanks,
>> > >>
>> > >> Jun
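
(A standalone sketch of the delay-queue pattern Jun describes above, using
java.util.concurrent.DelayQueue; the class names and the 100 ms delay are
made up for illustration, and this is not the broker's actual purgatory
code.)

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    class ThrottledResponse implements Delayed {
        final long expireAtNanos;
        final Runnable sendResponse;

        ThrottledResponse(long delayMs, Runnable sendResponse) {
            this.expireAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
            this.sendResponse = sendResponse;
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(expireAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    public class ThrottleReaperSketch {
        public static void main(String[] args) throws InterruptedException {
            DelayQueue<ThrottledResponse> queue = new DelayQueue<>();
            // The single expiration thread: take() blocks until the head of the
            // queue is due, so no request handler thread waits on the delay.
            Thread reaper = new Thread(() -> {
                while (true) {
                    try { queue.take().sendResponse.run(); }
                    catch (InterruptedException e) { return; }
                }
            });
            reaper.setDaemon(true);
            reaper.start();
            queue.put(new ThrottledResponse(100, () -> System.out.println("sent after delay")));
            Thread.sleep(200); // give the reaper time to fire before exiting
        }
    }
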
>> > >>
>> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk>
>> > >> wrote:
>> > >>
>> > >> > Hi Jay,
>> > >> >
>> > >> > Regarding 1, I definitely like the simplicity of keeping a single
>> > >> > throttle time field in the response. The downside is that the
>> > >> > client metrics will be more coarse grained.
>> > >> >
>> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
>> > >> > `log.cleaner.min.cleanable.ratio`.
>> > >> >
>> > >> > Ismael
>> > >> >
>> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io>
>> > >> > wrote:
>> > >> >
>> > >> > > A few minor comments:
>> > >> > >
>> > >> > >    1. Isn't it the case that the throttling time response field
>> > >> > >    should have the total time your request was throttled
>> > >> > >    irrespective of the quotas that caused that. Limiting it to the
>> > >> > >    byte rate quota doesn't make sense, but I also don't think we
>> > >> > >    want to end up adding new fields in the response for every
>> > >> > >    single thing we quota, right?
>> > >> > >    2. I don't think we should make this quota specifically about
>> > >> > >    io threads. Once we introduce these quotas people set them and
>> > >> > >    expect them to be enforced (and if they aren't it may cause an
>> > >> > >    outage). As a result they are a bit more sensitive than normal
>> > >> > >    configs, I think. The current thread pools seem like something
>> > >> > >    of an implementation detail and not the level the user-facing
>> > >> > >    quotas should be involved with. I think it might be better to
>> > >> > >    make this a general request-time throttle with no mention in
>> > >> > >    the naming about I/O threads and simply acknowledge the current
>> > >> > >    limitation (which we may someday fix) in the docs that this
>> > >> > >    covers only the time after the request is read off the network.
>> > >> > >    3. As such I think the right interface to the user would be
>> > >> > >    something like percent_request_time and be in {0,...100} or
>> > >> > >    request_time_ratio and be in {0.0,...,1.0} (I think "ratio" is
>> > >> > >    the terminology we used if the scale is between 0 and 1 in the
>> > >> > >    other metrics, right?)
>> > >> > >
>> > >> > > -Jay
>> > >> > >
>> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
>> > >> > > rajinisivaram@gmail.com> wrote:
>> > >> > >
>> > >> > > > Guozhang/Dong,
>> > >> > > >
>> > >> > > > Thank you for the feedback.
>> > >> > > >
>> > >> > > > Guozhang: I have updated the section on co-existence of byte
>> > >> > > > rate and request time quotas.
>> > >> > > >
>> > >> > > > Dong: I hadn't added much detail to the metrics and sensors
>> > >> > > > since they are going to be very similar to the existing metrics
>> > >> > > > and sensors. To avoid confusion, I have now added more detail.
>> > >> > > > All metrics are in the group "quotaType" and all sensors have
>> > >> > > > names starting with "quotaType" (where quotaType is
>> > >> > > > Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*).
>> > >> > > > So there will be no reuse of existing metrics/sensors. The new
>> > >> > > > ones for request processing time based throttling will be
>> > >> > > > completely independent of existing metrics/sensors, but will be
>> > >> > > > consistent in format.
>> > >> > > >
>> > >> > > > The existing throttle_time_ms field in produce/fetch responses
>> > >> > > > will not be impacted by this KIP. That will continue to return
>> > >> > > > byte-rate based throttling times. In addition, a new field
>> > >> > > > request_throttle_time_ms will be added to return request quota
>> > >> > > > based throttling times. These will be exposed as new metrics on
>> > >> > > > the client-side.
>> > >> > > >
>> > >> > > > Since all metrics and sensors are different for each type of
>> > >> > > > quota, I believe there are already sufficient metrics to monitor
>> > >> > > > throttling on both the client and broker side for each type of
>> > >> > > > throttling.
>> > >> > > >
>> > >> > > > Regards,
>> > >> > > >
>> > >> > > > Rajini
>> > >> > > >
>> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <lindong28@gmail.com>
>> > >> > > > wrote:
>> > >> > > >
>> > >> > > > > Hey Rajini,
>> > >> > > > >
>> > >> > > > > I think it makes a lot of sense to use io_thread_units as the
>> > >> > > > > metric to quota users' traffic here. LGTM overall. I have some
>> > >> > > > > questions regarding sensors.
>> > >> > > > >
>> > >> > > > > - Can you be more specific in the KIP about what sensors will
>> > >> > > > > be added? For example, it will be useful to specify the name
>> > >> > > > > and attributes of these new sensors.
>> > >> > > > >
>> > >> > > > > - We currently have throttle-time and queue-size for the
>> > >> > > > > byte-rate based quota. Are you going to have a separate
>> > >> > > > > throttle-time and queue-size for requests throttled by the
>> > >> > > > > io_thread_unit-based quota, or will they share the same
>> > >> > > > > sensor?
>> > >> > > > >
>> > >> > > > > - Does the throttle-time in the ProduceResponse and
>> > >> > > > > FetchResponse contain time due to the io_thread_unit-based
>> > >> > > > > quota?
>> > >> > > > >
>> > >> > > > > - Currently the Kafka server doesn't provide any log or
>> > >> > > > > metrics that tell whether any given clientId (or user) is
>> > >> > > > > throttled. This is not too bad because we can still check the
>> > >> > > > > client-side byte-rate metric to validate whether a given
>> > >> > > > > client is throttled. But with this io_thread_unit, there will
>> > >> > > > > be no way to validate whether a given client is slow because
>> > >> > > > > it has exceeded its io_thread_unit limit. It is necessary for
>> > >> > > > > users to be able to know this information to figure out
>> > >> > > > > whether they have reached their quota limit. How about we add
>> > >> > > > > a log4j log on the server side to periodically print the
>> > >> > > > > (client_id, byte-rate-throttle-time,
>> > >> > > > > io-thread-unit-throttle-time) so that the Kafka administrator
>> > >> > > > > can identify those users that have reached their limit and
>> > >> > > > > act accordingly?
>> > >> > > > >
>> > >> > > > > Thanks,
>> > >> > > > > Dong
>> > >> > > > >
>> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
>> > >> > > > > wangguoz@gmail.com> wrote:
>> > >> > > > >
>> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
>> > >> > > > > > comment on the throttling implementation:
>> > >> > > > > >
>> > >> > > > > > Stated as "Request processing time throttling will be
>> > >> > > > > > applied on top if necessary." I thought that it meant the
>> > >> > > > > > request processing time throttling is applied first, but
>> > >> > > > > > continuing to read I found it actually meant to apply
>> > >> > > > > > produce / fetch byte rate throttling first.
>> > >> > > > > >
>> > >> > > > > > Also the last sentence "The remaining delay if any is
>> > >> > > > > > applied to the response." is a bit confusing to me. Maybe
>> > >> > > > > > reword it a bit?
>> > >> > > > > >
>> > >> > > > > > Guozhang
>> > >> > > > > >
>> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <jun@confluent.io>
>> > >> > > > > > wrote:
>> > >> > > > > >
>> > >> > > > > > > Hi, Rajini,
>> > >> > > > > > >
>> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks good
>> > >> > > > > > > to me.
>> > >> > > > > > >
>> > >> > > > > > > Jun
>> > >> > > > > > >
>> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
>> > >> > > > > > > rajinisivaram@gmail.com> wrote:
>> > >> > > > > > >
>> > >> > > > > > > > Jun/Roger,
>> > >> > > > > > > >
>> > >> > > > > > > > Thank you for the feedback.
>> > >> > > > > > > >
>> > >> > > > > > > > 1. I have updated the KIP to use absolute units instead
>> > >> > > > > > > > of percentage. The property is called *io_thread_units*
>> > >> > > > > > > > to align with the thread count property
>> > >> > > > > > > > *num.io.threads*. When we implement network thread
>> > >> > > > > > > > utilization quotas, we can add another property
>> > >> > > > > > > > *network_thread_units*.
>> > >> > > > > > > >
>> > >> > > > > > > > 2. ControlledShutdown is already listed under the exempt
>> > >> > > > > > > > requests. Jun, did you mean a different request that
>> > >> > > > > > > > needs to be added? The four requests currently exempt in
>> > >> > > > > > > > the KIP are StopReplica, ControlledShutdown,
>> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These are controlled
>> > >> > > > > > > > using the ClusterAction ACL, so it is easy to exclude
>> > >> > > > > > > > them and only throttle if unauthorized. I wasn't sure if
>> > >> > > > > > > > there are other requests used only for inter-broker
>> > >> > > > > > > > communication that needed to be excluded.
>> > >> > > > > > > >
>> > >> > > > > > > > 3. I was thinking the smallest change would be to
>> > >> > > > > > > > replace all references to
>> > >> > > > > > > > *requestChannel.sendResponse()* with a local method
>> > >> > > > > > > > *sendResponseMaybeThrottle()* that does the throttling,
>> > >> > > > > > > > if any, and then sends the response. If we throttle
>> > >> > > > > > > > first in *KafkaApis.handle()*, the time spent within the
>> > >> > > > > > > > method handling the request will not be recorded or used
>> > >> > > > > > > > in throttling. We can look into this again when the PR
>> > >> > > > > > > > is ready for review.
>> > >> > > > > > > >
>> > >> > > > > > > > Regards,
>> > >> > > > > > > >
>> > >> > > > > > > > Rajini
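
(A toy model of the sendResponseMaybeThrottle() idea above; all names besides
requestChannel.sendResponse() and KafkaApis.handle() are invented, and the
real change would live in the broker's Scala request handling rather than a
standalone class like this.)

    import java.util.concurrent.atomic.AtomicLong;

    public class SendResponseMaybeThrottleSketch {
        static final AtomicLong usedNanosInWindow = new AtomicLong();

        static void sendResponse(String response) { // stand-in for requestChannel.sendResponse()
            System.out.println("sent: " + response);
        }

        // Single chokepoint: record the request's handler time, then delay the
        // send if the quota was exceeded (sleep stands in for purgatory).
        static void sendResponseMaybeThrottle(String response, long requestNanos,
                                              double quota, long windowMs)
                throws InterruptedException {
            long usedMs = usedNanosInWindow.addAndGet(requestNanos) / 1_000_000;
            long delayMs = Math.min(Math.max((long) (usedMs / quota) - windowMs, 0), windowMs);
            if (delayMs > 0)
                Thread.sleep(delayMs);
            sendResponse(response);
        }

        public static void main(String[] args) throws InterruptedException {
            sendResponseMaybeThrottle("FetchResponse", 5_000_000L, 0.01, 1_000);
        }
    }
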
>> > >> > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
>> > >> > > > > > > > roger.hoover@gmail.com> wrote:
>> > >> > > > > > > >
>> > >> > > > > > > > > Great to see this KIP and the excellent discussion.
>> > >> > > > > > > > >
>> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
>> > >> > > > > > > > > application is allocated 1 request handler unit, then
>> > >> > > > > > > > > it's as if I have a Kafka broker with a single request
>> > >> > > > > > > > > handler thread dedicated to me.  That's the most I can
>> > >> > > > > > > > > use, at least.  That allocation doesn't change even if
>> > >> > > > > > > > > an admin later increases the size of the request
>> > >> > > > > > > > > thread pool on the broker.  It's similar to the CPU
>> > >> > > > > > > > > abstraction that VMs and containers get from
>> > >> > > > > > > > > hypervisors or OS schedulers.  While different client
>> > >> > > > > > > > > access patterns can use wildly different amounts of
>> > >> > > > > > > > > request thread resources per request, a given
>> > >> > > > > > > > > application will generally have a stable access
>> > >> > > > > > > > > pattern and can figure out empirically how many
>> > >> > > > > > > > > "request thread units" it needs to meet its
>> > >> > > > > > > > > throughput/latency goals.
>> > >> > > > > > > > >
>> > >> > > > > > > > > Cheers,
>> > >> > > > > > > > >
>> > >> > > > > > > > > Roger
>> > >> > > > > > > > >
>> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
>> > >> > > > > > > > > jun@confluent.io> wrote:
>> > >> > > > > > > > >
>> > >> > > > > > > > > > Hi, Rajini,
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Thanks for the updated KIP. A few more comments.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > 1. A concern with request_time_percent is that it's
>> > >> > > > > > > > > > not an absolute value. Let's say you give a user a
>> > >> > > > > > > > > > 10% limit. If the admin doubles the number of
>> > >> > > > > > > > > > request handler threads, that user now actually has
>> > >> > > > > > > > > > twice the absolute capacity. This may confuse
>> > >> > > > > > > > > > people a bit. So, perhaps setting the quota based
>> > >> > > > > > > > > > on an absolute request thread unit is better.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an
>> > >> > > > > > > > > > inter-broker request and needs to be excluded from
>> > >> > > > > > > > > > throttling.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > 3. Implementation-wise, I am wondering if it's
>> > >> > > > > > > > > > simpler to apply the request time throttling first
>> > >> > > > > > > > > > in KafkaApis.handle(). Otherwise, we will need to
>> > >> > > > > > > > > > add the throttling logic to each type of request.
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Thanks,
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > Jun
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
>> > >> > > > > > > > > > rajinisivaram@gmail.com> wrote:
>> > >> > > > > > > > > >
>> > >> > > > > > > > > > > Jun,
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > Thank you for the review.
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > I have reverted to the original KIP that
>> > >> > > > > > > > > > > throttles based on request handler utilization.
>> > >> > > > > > > > > > > At the moment, it uses a percentage, but I am
>> > >> > > > > > > > > > > happy to change to a fraction (out of 1 instead
>> > >> > > > > > > > > > > of 100) if required. I have added the examples
>> > >> > > > > > > > > > > from this discussion to the KIP. Also added a
>> > >> > > > > > > > > > > "Future Work" section to address network thread
>> > >> > > > > > > > > > > utilization. The configuration is named
>> > >> > > > > > > > > > > "request_time_percent" with the expectation that
>> > >> > > > > > > > > > > it can also be used as the limit for network
>> > >> > > > > > > > > > > thread utilization when that is implemented, so
>> > >> > > > > > > > > > > that users have to set only one config for the
>> > >> > > > > > > > > > > two and not have to worry about the internal
>> > >> > > > > > > > > > > distribution of the work between the two thread
>> > >> > > > > > > > > > > pools in Kafka.
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > Regards,
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > Rajini
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
>> > >> > > > > > > > > > > jun@confluent.io> wrote:
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > > Hi, Rajini,
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Thanks for the proposal.
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > The benefit of using the request processing
>> > >> > > > > > > > > > > > time over the request rate is exactly what
>> > >> > > > > > > > > > > > people have said. I will just expand on that a
>> > >> > > > > > > > > > > > bit. Consider the following case. The producer
>> > >> > > > > > > > > > > > sends a produce request with a 10MB message but
>> > >> > > > > > > > > > > > compressed to 100KB with gzip. The
>> > >> > > > > > > > > > > > decompression of the message on the broker
>> > >> > > > > > > > > > > > could take 10-15 seconds, during which time a
>> > >> > > > > > > > > > > > request handler thread is completely blocked.
>> > >> > > > > > > > > > > > In this case, neither the byte-in quota nor the
>> > >> > > > > > > > > > > > request rate quota may be effective in
>> > >> > > > > > > > > > > > protecting the broker. Consider another case. A
>> > >> > > > > > > > > > > > consumer group starts with 10 instances and
>> > >> > > > > > > > > > > > later on switches to 20 instances. The request
>> > >> > > > > > > > > > > > rate will likely double, but the actual load on
>> > >> > > > > > > > > > > > the broker may not double since each fetch
>> > >> > > > > > > > > > > > request only contains half of the partitions. A
>> > >> > > > > > > > > > > > request rate quota may not be easy to configure
>> > >> > > > > > > > > > > > in this case.
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > What we really want is to be able to prevent a
>> > >> > > > > > > > > > > > client from using too much of the server side
>> > >> > > > > > > > > > > > resources. In this particular KIP, this
>> > >> > > > > > > > > > > > resource is the capacity of the request handler
>> > >> > > > > > > > > > > > threads. I agree that it may not be intuitive
>> > >> > > > > > > > > > > > for the users to determine how to set the right
>> > >> > > > > > > > > > > > limit. However, this is not completely new and
>> > >> > > > > > > > > > > > has been done in the container world already.
>> > >> > > > > > > > > > > > For example, Linux cgroup
>> > >> > > > > > > > > > > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
>> > >> > > > > > > > > > > > has the concept of cpu.cfs_quota_us, which
>> > >> > > > > > > > > > > > specifies the total amount of time in
>> > >> > > > > > > > > > > > microseconds for which all tasks in a cgroup
>> > >> > > > > > > > > > > > can run during a one second period. We can
>> > >> > > > > > > > > > > > potentially model the request handler threads
>> > >> > > > > > > > > > > > in a similar way. For example, each request
>> > >> > > > > > > > > > > > handler thread can be 1 request handler unit
>> > >> > > > > > > > > > > > and the admin can configure a limit on how many
>> > >> > > > > > > > > > > > units (say 0.01) a client can have.
>> > >> > > > > > > > > > > >
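
(A quick sketch of that unit model with illustrative numbers: if 1 unit is
one full request handler thread, a client's budget in thread-milliseconds
per second is simply units * 1000, independent of how many threads the pool
has.)

    public class HandlerUnitSketch {
        // units: allocated request handler units (1 unit = one full thread).
        static double budgetMsPerSec(double units) {
            return units * 1000.0;
        }

        public static void main(String[] args) {
            System.out.println(budgetMsPerSec(0.01)); // 10 ms of handler time per second
            System.out.println(budgetMsPerSec(1.0));  // a whole thread: 1000 ms per second
        }
    }
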
>> > >> > > > > > > > > > > > Regarding not throttling the internal broker
>> > >> > > > > > > > > > > > to broker requests: we could do that.
>> > >> > > > > > > > > > > > Alternatively, we could just let the admin
>> > >> > > > > > > > > > > > configure a high limit for the kafka user (it
>> > >> > > > > > > > > > > > may not be able to do that easily based on
>> > >> > > > > > > > > > > > clientId though).
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Ideally we want to be able to protect the
>> > >> > > > > > > > > > > > utilization of the network thread pool too. The
>> > >> > > > > > > > > > > > difficulty is mostly what Rajini said: (1) The
>> > >> > > > > > > > > > > > mechanism for throttling the requests is
>> > >> > > > > > > > > > > > through Purgatory and we will have to think
>> > >> > > > > > > > > > > > through how to integrate that into the network
>> > >> > > > > > > > > > > > layer. (2) In the network layer, currently we
>> > >> > > > > > > > > > > > know the user, but not the clientId of the
>> > >> > > > > > > > > > > > request. So, it's a bit tricky to throttle
>> > >> > > > > > > > > > > > based on clientId there. Plus, the byteOut
>> > >> > > > > > > > > > > > quota can already protect the network thread
>> > >> > > > > > > > > > > > utilization for fetch requests. So, if we can't
>> > >> > > > > > > > > > > > figure out this part right now, just focusing
>> > >> > > > > > > > > > > > on the request handling threads for this KIP is
>> > >> > > > > > > > > > > > still a useful feature.
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Thanks,
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > Jun
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <
>> > >> > > > > > > > > > > > rajinisivaram@gmail.com> wrote:
>> > >> > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Thank you all for the feedback.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Jay: I have removed the exemption for
>> > >> > > > > > > > > > > > > consumer heartbeat etc. Agree that protecting
>> > >> > > > > > > > > > > > > the cluster is more important than protecting
>> > >> > > > > > > > > > > > > individual apps. Have retained the exemption
>> > >> > > > > > > > > > > > > for StopReplica/LeaderAndIsr etc; these are
>> > >> > > > > > > > > > > > > throttled only if authorization fails (so
>> > >> > > > > > > > > > > > > they can't be used for DoS attacks in a
>> > >> > > > > > > > > > > > > secure cluster, but inter-broker requests are
>> > >> > > > > > > > > > > > > allowed to complete without delays).
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > I will wait another day to see if there is
>> > >> > > > > > > > > > > > > any objection to quotas based on request
>> > >> > > > > > > > > > > > > processing time (as opposed to request rate)
>> > >> > > > > > > > > > > > > and if there are no objections, I will revert
>> > >> > > > > > > > > > > > > to the original proposal with some changes.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > The original proposal was only including the
>> > >> > > > > > > > > > > > > time used by the request handler threads
>> > >> > > > > > > > > > > > > (that made calculation easy). I think the
>> > >> > > > > > > > > > > > > suggestion is to include the time spent in
>> > >> > > > > > > > > > > > > the network threads as well since that may be
>> > >> > > > > > > > > > > > > significant. As Jay pointed out, it is more
>> > >> > > > > > > > > > > > > complicated to calculate the total available
>> > >> > > > > > > > > > > > > CPU time and convert to a ratio when there
>> > >> > > > > > > > > > > > > are *m* I/O threads and *n* network threads.
>> > >> > > > > > > > > > > > > ThreadMXBean#getThreadCPUTime() may give us
>> > >> > > > > > > > > > > > > what we want, but it can be very expensive on
>> > >> > > > > > > > > > > > > some platforms. As Becket and Guozhang have
>> > >> > > > > > > > > > > > > pointed out, we do have several time
>> > >> > > > > > > > > > > > > measurements already for generating metrics
>> > >> > > > > > > > > > > > > that we could use, though we might want to
>> > >> > > > > > > > > > > > > switch to nanoTime() instead of
>> > >> > > > > > > > > > > > > currentTimeMillis() since some of the values
>> > >> > > > > > > > > > > > > for small requests may be < 1ms. But rather
>> > >> > > > > > > > > > > > > than add up the time spent in the I/O thread
>> > >> > > > > > > > > > > > > and network thread, wouldn't it be better to
>> > >> > > > > > > > > > > > > convert the time spent on each thread into a
>> > >> > > > > > > > > > > > > separate ratio? UserA has a request quota of
>> > >> > > > > > > > > > > > > 5%. Can we take that to mean that UserA can
>> > >> > > > > > > > > > > > > use 5% of the time on network threads and 5%
>> > >> > > > > > > > > > > > > of the time on I/O threads? If either is
>> > >> > > > > > > > > > > > > exceeded, the response is throttled - it
>> > >> > > > > > > > > > > > > would mean maintaining two sets of metrics
>> > >> > > > > > > > > > > > > for the two durations, but would result in
>> > >> > > > > > > > > > > > > more meaningful ratios. We could define two
>> > >> > > > > > > > > > > > > quota limits (UserA has 5% of request threads
>> > >> > > > > > > > > > > > > and 10% of network threads), but that seems
>> > >> > > > > > > > > > > > > unnecessary and harder to explain to users.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Back to why and how quotas are applied to
>> > >> > > > > > > > > > > > > network thread utilization:
>> > >> > > > > > > > > > > > > a) In the case of fetch, the time spent in
>> > >> > > > > > > > > > > > > the network thread may be significant and I
>> > >> > > > > > > > > > > > > can see the need to include this. Are there
>> > >> > > > > > > > > > > > > other requests where the network thread
>> > >> > > > > > > > > > > > > utilization is significant? In the case of
>> > >> > > > > > > > > > > > > fetch, request handler thread utilization
>> > >> > > > > > > > > > > > > would throttle clients with a high request
>> > >> > > > > > > > > > > > > rate and low data volume, and the fetch byte
>> > >> > > > > > > > > > > > > rate quota will throttle clients with high
>> > >> > > > > > > > > > > > > data volume. Network thread utilization is
>> > >> > > > > > > > > > > > > perhaps proportional to the data volume. I am
>> > >> > > > > > > > > > > > > wondering if we even need to throttle based
>> > >> > > > > > > > > > > > > on network thread utilization or whether the
>> > >> > > > > > > > > > > > > data volume quota covers this case.
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > b) At the moment, we record and check for
>> > >> > > > > > > > > > > > > quota violation at the same time. If a quota
>> > >> > > > > > > > > > > > > is violated, the response is delayed. Using
>> > >> > > > > > > > > > > > > Jay's example of disk reads for fetches
>> > >> > > > > > > > > > > > > happening in the network thread, we can't
>> > >> > > > > > > > > > > > > record and delay a response after the disk
>> > >> > > > > > > > > > > > > reads. We could record the time spent on the
>> > >> > > > > > > > > > > > > network thread when the response is complete
>> > >> > > > > > > > > > > > > and introduce a delay for handling a
>> > >> > > > > > > > > > > > > subsequent request (separating out recording
>> > >> > > > > > > > > > > > > and quota violation handling in the case of
>> > >> > > > > > > > > > > > > network thread overload). Does that make
>> > >> > > > > > > > > > > > > sense?
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Regards,
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > Rajini
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
>> > >> > > > > > > > > > > > > becket.qin@gmail.com> wrote:
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Hey Jay,
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time
>> > >> > > > > > > > > > > > > > is a little tricky. I am thinking that
>> > >> > > > > > > > > > > > > > maybe we can use the existing request
>> > >> > > > > > > > > > > > > > statistics. They are already very detailed
>> > >> > > > > > > > > > > > > > so we can probably see the approximate CPU
>> > >> > > > > > > > > > > > > > time from them, e.g. something like
>> > >> > > > > > > > > > > > > > (total_time - request/response_queue_time -
>> > >> > > > > > > > > > > > > > remote_time).
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user is
>> > >> > > > > > > > > > > > > > throttled it is likely that we need to see
>> > >> > > > > > > > > > > > > > if anything has gone wrong first, and if
>> > >> > > > > > > > > > > > > > the users are well behaved and just need
>> > >> > > > > > > > > > > > > > more resources, we will have to bump up the
>> > >> > > > > > > > > > > > > > quota for them. It is true that
>> > >> > > > > > > > > > > > > > pre-allocating CPU time quota precisely for
>> > >> > > > > > > > > > > > > > the users is difficult. So in practice it
>> > >> > > > > > > > > > > > > > would probably be more like first setting a
>> > >> > > > > > > > > > > > > > relatively high protective CPU time quota
>> > >> > > > > > > > > > > > > > for everyone and increasing that for some
>> > >> > > > > > > > > > > > > > individual clients on demand.
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Thanks,
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang
>> > >> > > > > > > > > > > > > > Wang <wangguoz@gmail.com> wrote:
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see it
>> > >> > > > > > > > > > > > > > > happening.
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or
>> > >> > > > > > > > > > > > > > > more specifically processing time ratio,
>> > >> > > > > > > > > > > > > > > instead of the request rate throttling as
>> > >> > > > > > > > > > > > > > > well. Becket has very well summed up my
>> > >> > > > > > > > > > > > > > > rationales above, and one thing to add
>> > >> > > > > > > > > > > > > > > here is that the former has good support
>> > >> > > > > > > > > > > > > > > both for "protecting against rogue
>> > >> > > > > > > > > > > > > > > clients" and for "utilizing a cluster for
>> > >> > > > > > > > > > > > > > > multi-tenancy usage": when thinking about
>> > >> > > > > > > > > > > > > > > how to explain this to the end users, I
>> > >> > > > > > > > > > > > > > > find it actually more natural than the
>> > >> > > > > > > > > > > > > > > request rate since, as mentioned above,
>> > >> > > > > > > > > > > > > > > different requests will have quite
>> > >> > > > > > > > > > > > > > > different "cost", and Kafka today already
>> > >> > > > > > > > > > > > > > > has various request types (produce,
>> > >> > > > > > > > > > > > > > > fetch, admin, metadata, etc); because of
>> > >> > > > > > > > > > > > > > > that the request rate throttling may not
>> > >> > > > > > > > > > > > > > > be as effective unless it is set very
>> > >> > > > > > > > > > > > > > > conservatively.
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > Regarding user reactions when they are
>> > >> > > > > > > > > > > > > > > throttled, I think it may differ
>> > >> > > > > > > > > > > > > > > case-by-case, and needs to be discovered
>> > >> > > > > > > > > > > > > > > / guided by looking at relative metrics.
>> > >> > > > > > > > > > > > > > > So in other words users would not expect
>> > >> > > > > > > > > > > > > > > to get additional information by simply
>> > >> > > > > > > > > > > > > > > being told "hey, you are throttled",
>> > >> > > > > > > > > > > > > > > which is all that throttling does; they
>> > >> > > > > > > > > > > > > > > need to take a follow-up step and see
>> > >> > > > > > > > > > > > > > > "hmm, I'm throttled probably because of
>> > >> > > > > > > > > > > > > > > ..", which is done by looking at other
>> > >> > > > > > > > > > > > > > > metric values: e.g. whether I'm
>> > >> > > > > > > > > > > > > > > bombarding the brokers with metadata
>> > >> > > > > > > > > > > > > > > requests, which are usually cheap to
>> > >> > > > > > > > > > > > > > > handle but I'm sending thousands per
>> > >> > > > > > > > > > > > > > > second; or is it because I'm catching up
>> > >> > > > > > > > > > > > > > > and hence sending very heavy fetch
>> > >> > > > > > > > > > > > > > > requests with large min.bytes, etc.
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > Regarding the implementation, as once
>> > >> > > > > > > > > > > > > > > discussed with Jun, this seems not very
>> > >> > > > > > > > > > > > > > > difficult since today we are already
>> > >> > > > > > > > > > > > > > > collecting the "thread pool utilization"
>> > >> > > > > > > > > > > > > > > metrics, which is a single percentage
>> > >> > > > > > > > > > > > > > > "aggregateIdleMeter" value; but we are
>> > >> > > > > > > > > > > > > > > already effectively aggregating it for
>> > >> > > > > > > > > > > > > > > each request in KafkaRequestHandler, and
>> > >> > > > > > > > > > > > > > > we can just extend it by recording the
>> > >> > > > > > > > > > > > > > > source client id when handling them and
>> > >> > > > > > > > > > > > > > > aggregating by clientId as well as the
>> > >> > > > > > > > > > > > > > > total aggregate.
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > Guozhang
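
(A minimal sketch of that kind of per-client aggregation; the names are
invented and this is not the actual KafkaRequestHandler code. Each request's
handler time is recorded under its clientId as well as into a total, so a
per-client utilization ratio falls out directly.)

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    public class PerClientTimeSketch {
        static final Map<String, LongAdder> byClientNanos = new ConcurrentHashMap<>();
        static final LongAdder totalNanos = new LongAdder();

        // Called once per handled request with the time it spent in the handler.
        static void record(String clientId, long nanos) {
            byClientNanos.computeIfAbsent(clientId, k -> new LongAdder()).add(nanos);
            totalNanos.add(nanos);
        }

        public static void main(String[] args) {
            record("app-1", 2_500_000L);
            record("app-2", 500_000L);
            record("app-1", 1_000_000L);
            byClientNanos.forEach((client, t) -> System.out.printf(
                    "%s: %.1f%% of handler time%n", client, 100.0 * t.sum() / totalNanos.sum()));
        }
    }
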
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay
>> > >> > > > > > > > > > > > > > > Kreps <jay@confluent.io> wrote:
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > When I thought about it more deeply I
>> > >> > > > > > > > > > > > > > > > came around to the "percent of
>> > >> > > > > > > > > > > > > > > > processing time" metric too. It seems a
>> > >> > > > > > > > > > > > > > > > lot closer to the thing we actually
>> > >> > > > > > > > > > > > > > > > care about and need to protect. I also
>> > >> > > > > > > > > > > > > > > > think this would be a very useful
>> > >> > > > > > > > > > > > > > > > metric even in the absence of
>> > >> > > > > > > > > > > > > > > > throttling just to debug who's using
>> > >> > > > > > > > > > > > > > > > capacity.
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > Two problems to consider:
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > >    1. I agree that for the user it is
>> > >> > > > > > > > > > > > > > > >    understandable what led to their
>> > >> > > > > > > > > > > > > > > >    being throttled, but it is a bit
>> > >> > > > > > > > > > > > > > > >    hard to figure out the safe range
>> > >> > > > > > > > > > > > > > > >    for them. i.e. if I have a new app
>> > >> > > > > > > > > > > > > > > >    that will send 200 messages/sec I
>> > >> > > > > > > > > > > > > > > >    can probably reason that I'll be
>> > >> > > > > > > > > > > > > > > >    under the throttling limit of 300
>> > >> > > > > > > > > > > > > > > >    req/sec. However if I need to be
>> > >> > > > > > > > > > > > > > > >    under a 10% CPU resources limit it
>> > >> > > > > > > > > > > > > > > >    may be a bit harder for me to know a
>> > >> > > > > > > > > > > > > > > >    priori if I will or won't.
>> > >> > > > > > > > > > > > > > > >    2. Calculating the available CPU
>> > >> > > > > > > > > > > > > > > >    time is a bit difficult since there
>> > >> > > > > > > > > > > > > > > >    are actually two thread pools--the
>> > >> > > > > > > > > > > > > > > >    I/O threads and the network threads.
>> > >> > > > > > > > > > > > > > > >    I think it might be workable to
>> > >> > > > > > > > > > > > > > > >    count just the I/O thread time as in
>> > >> > > > > > > > > > > > > > > >    the proposal, but the network thread
>> > >> > > > > > > > > > > > > > > >    work is actually non-trivial (e.g.
>> > >> > > > > > > > > > > > > > > >    all the disk reads for fetches
>> > >> > > > > > > > > > > > > > > >    happen in that thread). If you count
>> > >> > > > > > > > > > > > > > > >    both the network and I/O threads it
>> > >> > > > > > > > > > > > > > > >    can skew things a bit. E.g. say you
>> > >> > > > > > > > > > > > > > > >    have 50 network threads, 10 I/O
>> > >> > > > > > > > > > > > > > > >    threads, and 8 cores, what is the
>> > >> > > > > > > > > > > > > > > >    CPU time available in a second? I
>> > >> > > > > > > > > > > > > > > >    suppose this is a problem whenever
>> > >> > > > > > > > > > > > > > > >    you have a bottleneck between I/O
>> > >> > > > > > > > > > > > > > > >    and network threads or if you end up
>> > >> > > > > > > > > > > > > > > >    significantly over-provisioning one
>> > >> > > > > > > > > > > > > > > >    pool (both of which are hard to
>> > >> > > > > > > > > > > > > > > >    avoid).
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > An alternative for CPU throttling would
>> > >> > > > > > > > > > > > > > > > be to use this api:
>> > >> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > That would let you track actual CPU
>> usage
>> > >> > across
>> > >> > > > the
>> > >> > > > > > > > network,
>> > >> > > > > > > > > > I/O
>> > >> > > > > > > > > > > > > > > threads,
>> > >> > > > > > > > > > > > > > > > and purgatory threads and look at it
>> as a
>> > >> > > > percentage
>> > >> > > > > of
>> > >> > > > > > > > total
>> > >> > > > > > > > > > > > cores.
>> > >> > > > > > > > > > > > > I
>> > >> > > > > > > > > > > > > > > > think this fixes many problems in the
>> > >> > reliability
>> > >> > > > of
>> > >> > > > > > the
>> > >> > > > > > > > > > metric.
>> > >> > > > > > > > > > > > It's
>> > >> > > > > > > > > > > > > > > > meaning is slightly different as it is
>> > just
>> > >> CPU
>> > >> > > > (you
>> > >> > > > > > > don't
>> > >> > > > > > > > > get
>> > >> > > > > > > > > > > > > charged
>> > >> > > > > > > > > > > > > > > for
>> > >> > > > > > > > > > > > > > > > time blocking on I/O) but that may be
>> okay
>> > >> > > because
>> > >> > > > we
>> > >> > > > > > > > already
>> > >> > > > > > > > > > > have
>> > >> > > > > > > > > > > > a
>> > >> > > > > > > > > > > > > > > > throttle on I/O. The downside is I
>> think
>> > it
>> > >> is
>> > >> > > > > possible
>> > >> > > > > > > > this
>> > >> > > > > > > > > > api
>> > >> > > > > > > > > > > > can
>> > >> > > > > > > > > > > > > be
>> > >> > > > > > > > > > > > > > > > disabled or isn't always available and
>> it
>> > >> may
>> > >> > > also
>> > >> > > > be
>> > >> > > > > > > > > expensive
>> > >> > > > > > > > > > > > (also
>> > >> > > > > > > > > > > > > > > I've
>> > >> > > > > > > > > > > > > > > > never used it so not sure if it really
>> > works
>> > >> > the
>> > >> > > > way
>> > >> > > > > i
>> > >> > > > > > > > > think).
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > -Jay
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket
>> > Qin
>> > >> <
>> > >> > > > > > > > > > > becket.qin@gmail.com>
>> > >> > > > > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > If the purpose of the KIP is only to
>> > >> protect
>> > >> > > the
>> > >> > > > > > > cluster
>> > >> > > > > > > > > from
>> > >> > > > > > > > > > > > being
>> > >> > > > > > > > > > > > > > > > > overwhelmed by crazy clients and is
>> not
>> > >> > > intended
>> > >> > > > to
>> > >> > > > > > > > address
>> > >> > > > > > > > > > > > > resource
>> > >> > > > > > > > > > > > > > > > > allocation problem among the
>> clients, I
>> > am
>> > >> > > > > wondering
>> > >> > > > > > if
>> > >> > > > > > > > > using
>> > >> > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > handling time quota (CPU time quota)
>> is
>> > a
>> > >> > > better
>> > >> > > > > > > option.
>> > >> > > > > > > > > Here
>> > >> > > > > > > > > > > are
>> > >> > > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > > > reasons:
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > 1. request handling time quota has
>> > better
>> > >> > > > > protection.
>> > >> > > > > > > Say
>> > >> > > > > > > > > we
>> > >> > > > > > > > > > > have
>> > >> > > > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > rate quota and set that to some value
>> > like
>> > >> > 100
>> > >> > > > > > > > > requests/sec,
>> > >> > > > > > > > > > it
>> > >> > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > possible
>> > >> > > > > > > > > > > > > > > > > that some of the requests are very
>> > >> expensive
>> > >> > > > > actually
>> > >> > > > > > > > take
>> > >> > > > > > > > > a
>> > >> > > > > > > > > > > lot
>> > >> > > > > > > > > > > > of
>> > >> > > > > > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > > handle. In that case a few clients
>> may
>> > >> still
>> > >> > > > > occupy a
>> > >> > > > > > > lot
>> > >> > > > > > > > > of
>> > >> > > > > > > > > > > CPU
>> > >> > > > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > > > even
>> > >> > > > > > > > > > > > > > > > > the request rate is low. Arguably we
>> can
>> > >> > > > carefully
>> > >> > > > > > set
>> > >> > > > > > > > > > request
>> > >> > > > > > > > > > > > rate
>> > >> > > > > > > > > > > > > > > quota
>> > >> > > > > > > > > > > > > > > > > for each request and client id
>> > >> combination,
>> > >> > but
>> > >> > > > it
>> > >> > > > > > > could
>> > >> > > > > > > > > > still
>> > >> > > > > > > > > > > be
>> > >> > > > > > > > > > > > > > > tricky
>> > >> > > > > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > > get it right for everyone.
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > If we use the request time handling
>> > >> quota, we
>> > >> > > can
>> > >> > > > > > > simply
>> > >> > > > > > > > > say
>> > >> > > > > > > > > > no
>> > >> > > > > > > > > > > > > > clients
>> > >> > > > > > > > > > > > > > > > can
>> > >> > > > > > > > > > > > > > > > > take up to more than 30% of the total
>> > >> request
>> > >> > > > > > handling
>> > >> > > > > > > > > > capacity
>> > >> > > > > > > > > > > > > > > (measured
>> > >> > > > > > > > > > > > > > > > > by time), regardless of the
>> difference
>> > >> among
>> > >> > > > > > different
>> > >> > > > > > > > > > requests
>> > >> > > > > > > > > > > > or
>> > >> > > > > > > > > > > > > > what
>> > >> > > > > > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > > the client doing. In this case maybe
>> we
>> > >> can
>> > >> > > quota
>> > >> > > > > all
>> > >> > > > > > > the
>> > >> > > > > > > > > > > > requests
>> > >> > > > > > > > > > > > > if
>> > >> > > > > > > > > > > > > > > we
>> > >> > > > > > > > > > > > > > > > > want to.
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > 2. The main benefit of using request
>> > rate
>> > >> > limit
>> > >> > > > is
>> > >> > > > > > that
>> > >> > > > > > > > it
>> > >> > > > > > > > > > > seems
>> > >> > > > > > > > > > > > > more
>> > >> > > > > > > > > > > > > > > > > intuitive. It is true that it is
>> > probably
>> > >> > > easier
>> > >> > > > to
>> > >> > > > > > > > explain
>> > >> > > > > > > > > > to
>> > >> > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > user
>> > >> > > > > > > > > > > > > > > > > what does that mean. However, in
>> > practice
>> > >> it
>> > >> > > > looks
>> > >> > > > > > the
>> > >> > > > > > > > > impact
>> > >> > > > > > > > > > > of
>> > >> > > > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > rate quota is not more quantifiable
>> than
>> > >> the
>> > >> > > > > request
>> > >> > > > > > > > > handling
>> > >> > > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > > quota.
>> > >> > > > > > > > > > > > > > > > > Unlike the byte rate quota, it is
>> still
>> > >> > > difficult
>> > >> > > > > to
>> > >> > > > > > > > give a
>> > >> > > > > > > > > > > > number
>> > >> > > > > > > > > > > > > > > about
>> > >> > > > > > > > > > > > > > > > > impact of throughput or latency when
>> a
>> > >> > request
>> > >> > > > rate
>> > >> > > > > > > quota
>> > >> > > > > > > > > is
>> > >> > > > > > > > > > > hit.
>> > >> > > > > > > > > > > > > So
>> > >> > > > > > > > > > > > > > it
>> > >> > > > > > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > > not better than the request handling
>> > time
>> > >> > > quota.
>> > >> > > > In
>> > >> > > > > > > fact
>> > >> > > > > > > > I
>> > >> > > > > > > > > > feel
>> > >> > > > > > > > > > > > it
>> > >> > > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > > clearer to tell user that "you are
>> > limited
>> > >> > > > because
>> > >> > > > > > you
>> > >> > > > > > > > have
>> > >> > > > > > > > > > > taken
>> > >> > > > > > > > > > > > > 30%
>> > >> > > > > > > > > > > > > > > of
>> > >> > > > > > > > > > > > > > > > > the CPU time on the broker" than
>> > otherwise
>> > >> > > > > something
>> > >> > > > > > > like
>> > >> > > > > > > > > > "your
>> > >> > > > > > > > > > > > > > request
>> > >> > > > > > > > > > > > > > > > > rate quota on metadata request has
>> > >> reached".
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > Thanks,
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay
>> > >> Kreps <
>> > >> > > > > > > > > jay@confluent.io
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > I think this proposal makes a lot
>> of
>> > >> sense
>> > >> > > > > > > (especially
>> > >> > > > > > > > > now
>> > >> > > > > > > > > > > that
>> > >> > > > > > > > > > > > > it
>> > >> > > > > > > > > > > > > > is
>> > >> > > > > > > > > > > > > > > > > > oriented around request rate) and
>> > fills
>> > >> the
>> > >> > > > > biggest
>> > >> > > > > > > > > > remaining
>> > >> > > > > > > > > > > > gap
>> > >> > > > > > > > > > > > > > in
>> > >> > > > > > > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > > > > multi-tenancy story.
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > I think for intra-cluster
>> > communication
>> > >> > > > > > (StopReplica,
>> > >> > > > > > > > > etc)
>> > >> > > > > > > > > > we
>> > >> > > > > > > > > > > > > could
>> > >> > > > > > > > > > > > > > > > avoid
>> > >> > > > > > > > > > > > > > > > > > throttling entirely. You can
>> secure or
>> > >> > > > otherwise
>> > >> > > > > > > > > lock-down
>> > >> > > > > > > > > > > the
>> > >> > > > > > > > > > > > > > > cluster
>> > >> > > > > > > > > > > > > > > > > > communication to avoid any
>> > unauthorized
>> > >> > > > external
>> > >> > > > > > > party
>> > >> > > > > > > > > from
>> > >> > > > > > > > > > > > > trying
>> > >> > > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > > > initiate these requests. As a
>> result
>> > we
>> > >> are
>> > >> > > as
>> > >> > > > > > likely
>> > >> > > > > > > > to
>> > >> > > > > > > > > > > cause
>> > >> > > > > > > > > > > > > > > problems
>> > >> > > > > > > > > > > > > > > > > as
>> > >> > > > > > > > > > > > > > > > > > solve them by throttling these,
>> right?
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > I'm not so sure that we should
>> exempt
>> > >> the
>> > >> > > > > consumer
>> > >> > > > > > > > > requests
>> > >> > > > > > > > > > > > such
>> > >> > > > > > > > > > > > > as
>> > >> > > > > > > > > > > > > > > > > > heartbeat. It's true that if we
>> > >> throttle an
>> > >> > > > app's
>> > >> > > > > > > > > heartbeat
>> > >> > > > > > > > > > > > > > requests
>> > >> > > > > > > > > > > > > > > it
>> > >> > > > > > > > > > > > > > > > > may
>> > >> > > > > > > > > > > > > > > > > > cause it to fall out of its
>> consumer
>> > >> group.
>> > >> > > > > However
>> > >> > > > > > > if
>> > >> > > > > > > > we
>> > >> > > > > > > > > > > don't
>> > >> > > > > > > > > > > > > > > > throttle
>> > >> > > > > > > > > > > > > > > > > it
>> > >> > > > > > > > > > > > > > > > > > it may DDOS the cluster if the
>> > heartbeat
>> > >> > > > interval
>> > >> > > > > > is
>> > >> > > > > > > > set
>> > >> > > > > > > > > > > > > > incorrectly
>> > >> > > > > > > > > > > > > > > or
>> > >> > > > > > > > > > > > > > > > > if
>> > >> > > > > > > > > > > > > > > > > > some client in some language has a
>> > bug.
>> > >> I
>> > >> > > think
>> > >> > > > > the
>> > >> > > > > > > > > policy
>> > >> > > > > > > > > > > with
>> > >> > > > > > > > > > > > > > this
>> > >> > > > > > > > > > > > > > > > kind
>> > >> > > > > > > > > > > > > > > > > > of throttling is to protect the
>> > cluster
>> > >> > above
>> > >> > > > any
>> > >> > > > > > > > > > individual
>> > >> > > > > > > > > > > > app,
>> > >> > > > > > > > > > > > > > > > right?
>> > >> > > > > > > > > > > > > > > > > I
>> > >> > > > > > > > > > > > > > > > > > think in general this should be
>> okay
>> > >> since
>> > >> > > for
>> > >> > > > > most
>> > >> > > > > > > > > > > deployments
>> > >> > > > > > > > > > > > > > this
>> > >> > > > > > > > > > > > > > > > > > setting is meant as more of a
>> safety
>> > >> > > > valve---that
>> > >> > > > > > is
>> > >> > > > > > > > > rather
>> > >> > > > > > > > > > > > than
>> > >> > > > > > > > > > > > > > set
>> > >> > > > > > > > > > > > > > > > > > something very close to what you
>> > expect
>> > >> to
>> > >> > > need
>> > >> > > > > > (say
>> > >> > > > > > > 2
>> > >> > > > > > > > > > > req/sec
>> > >> > > > > > > > > > > > or
>> > >> > > > > > > > > > > > > > > > > whatever)
>> > >> > > > > > > > > > > > > > > > > > you would have something quite high
>> > >> (like
>> > >> > 100
>> > >> > > > > > > req/sec)
>> > >> > > > > > > > > with
>> > >> > > > > > > > > > > > this
>> > >> > > > > > > > > > > > > > > meant
>> > >> > > > > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > > > prevent a client gone crazy. I
>> think
>> > >> when
>> > >> > > used
>> > >> > > > > this
>> > >> > > > > > > way
>> > >> > > > > > > > > > > > allowing
>> > >> > > > > > > > > > > > > > > those
>> > >> > > > > > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > > > be throttled would actually provide
>> > >> > > meaningful
>> > >> > > > > > > > > protection.
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > -Jay
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM,
>> > Rajini
>> > >> > > > Sivaram <
>> > >> > > > > > > > > > > > > > > > rajinisivaram@gmail.com
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > wrote:
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > > Hi all,
>> > >> > > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > > I have just created KIP-124 to
>> > >> introduce
>> > >> > > > > request
>> > >> > > > > > > rate
>> > >> > > > > > > > > > > quotas
>> > >> > > > > > > > > > > > to
>> > >> > > > > > > > > > > > > > > > Kafka:
>> > >> > > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > > https://cwiki.apache.org/
>> > >> > > > > > > > confluence/display/KAFKA/KIP-
>> > >> > > > > > > > > > > > > > > > > > > 124+-+Request+rate+quotas
>> > >> > > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > > The proposal is for a simple
>> > >> percentage
>> > >> > > > request
>> > >> > > > > > > > > handling
>> > >> > > > > > > > > > > time
>> > >> > > > > > > > > > > > > > quota
>> > >> > > > > > > > > > > > > > > > > that
>> > >> > > > > > > > > > > > > > > > > > > can be allocated to
>> *<client-id>*,
>> > >> > *<user>*
>> > >> > > > or
>> > >> > > > > > > > *<user,
>> > >> > > > > > > > > > > > > > client-id>*.
>> > >> > > > > > > > > > > > > > > > > There
>> > >> > > > > > > > > > > > > > > > > > > are a few other suggestions also
>> > under
>> > >> > > > > "Rejected
>> > >> > > > > > > > > > > > alternatives".
>> > >> > > > > > > > > > > > > > > > > Feedback
>> > >> > > > > > > > > > > > > > > > > > > and suggestions are welcome.
>> > >> > > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > > Thank you...
>> > >> > > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > > Regards,
>> > >> > > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > > > Rajini
>> > >> > > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > > > --
>> > >> > > > > > > > > > > > > > > -- Guozhang
>> > >> > > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > > >
>> > >> > > > > > > > > > > > >
>> > >> > > > > > > > > > > >
>> > >> > > > > > > > > > >
>> > >> > > > > > > > > >
>> > >> > > > > > > > >
>> > >> > > > > > > >
>> > >> > > > > > >
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > >
>> > >> > > > > > --
>> > >> > > > > > -- Guozhang
>> > >> > > > > >
>> > >> > > > >
>> > >> > > >
>> > >> > >
>> > >> >
>> > >>
>> > >
>> > >
>> >
>>
>
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Thank you all for the feedback.

Ismael #1. It makes sense not to throttle inter-broker requests like
LeaderAndIsr etc. The simplest way to ensure that clients cannot use these
requests to bypass quotas for DoS attacks is to ensure that ACLs prevent
clients from using these requests, and that unauthorized requests are
counted towards quotas.
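
To illustrate the check (a minimal sketch with illustrative names, not the
actual KafkaApis code):

    // Sketch of the exemption rule: inter-broker APIs bypass throttling only
    // when the ClusterAction ACL check passed; an unauthorized attempt is
    // still charged to the sender's quota, so it cannot be used for DoS.
    object ThrottleExemptionSketch {
      private val interBrokerApis =
        Set("StopReplica", "ControlledShutdown", "LeaderAndIsr", "UpdateMetadata")

      def isThrottled(apiKey: String, clusterActionAuthorized: Boolean): Boolean =
        !(interBrokerApis.contains(apiKey) && clusterActionAuthorized)

      def main(args: Array[String]): Unit = {
        assert(!isThrottled("LeaderAndIsr", clusterActionAuthorized = true))  // broker-to-broker: exempt
        assert(isThrottled("LeaderAndIsr", clusterActionAuthorized = false))  // unauthorized client: charged
        assert(isThrottled("Produce", clusterActionAuthorized = false))       // ordinary client request
      }
    }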

Ismael #2, Jay #1: I was thinking that these quotas can return a separate
throttle time, and all utilization-based quotas could use the same field
(we won't add another one for network thread utilization, for instance).
But perhaps it makes sense to keep byte rate quotas separate in
produce/fetch responses to provide separate metrics? Agree with Ismael
that the name of the existing field should be changed if we have two.
Happy to switch to a single combined throttle time if that is sufficient.
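
For illustration, the two response shapes under discussion would look
roughly like this (placeholder field names, not the final protocol schema):

    // Placeholder names, not the final protocol: either one combined throttle
    // time, or separate fields per quota type so clients can expose separate
    // metrics for byte-rate and request-time throttling.
    case class CombinedThrottle(throttleTimeMs: Int)

    case class SplitThrottle(byteRateThrottleTimeMs: Int, requestTimeThrottleTimeMs: Int) {
      // If only a single delay is applied on the wire, taking the max is one
      // plausible way to combine them, since the delays overlap rather than add.
      def appliedDelayMs: Int = math.max(byteRateThrottleTimeMs, requestTimeThrottleTimeMs)
    }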

Ismael #4, #5, #6: Will update the KIP. Will use a dot-separated name for
the new property. Replication quotas use dot-separated names, so it will be
consistent with all properties except the byte rate quotas.

Radai: #1 Request processing time rather than request rate was chosen
because the time per request can vary significantly between requests, as
mentioned in the discussion and in the KIP.
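
A toy comparison of the two accounting schemes (assumed numbers) shows why:

    // A rate quota counts requests regardless of cost; a time quota charges
    // each request the handler time it actually consumed.
    object QuotaAccountingSketch {
      def rateUsed(handlerTimesMs: Seq[Double]): Double = handlerTimesMs.size
      def timeUsedMs(handlerTimesMs: Seq[Double]): Double = handlerTimesMs.sum

      def main(args: Array[String]): Unit = {
        val cheap  = Seq.fill(100)(0.5) // 100 small requests, 0.5 ms each
        val costly = Seq(50.0, 50.0)    // 2 expensive requests, e.g. gzip produce
        println(s"rate: cheap=${rateUsed(cheap)} costly=${rateUsed(costly)}")     // 100.0 vs 2.0
        println(s"time: cheap=${timeUsedMs(cheap)} costly=${timeUsedMs(costly)}") // 50.0 vs 100.0 ms
      }
    }

A rate quota would throttle the cheap client first, even though the costly
client consumes twice the handler time.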
#2 Two separate quotas for heartbeats/regular requests feel like more
configuration and more metrics. Since most users would set quotas higher
than the expected usage and quotas are more of a safety net, a single quota
should work in most cases.
#3 The number of requests in purgatory is limited by the number of active
connections since only one request per connection will be throttled at a
time.
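
A minimal sketch of that mechanism (assumed shape, not the broker
internals):

    import java.util.concurrent.{DelayQueue, Delayed, TimeUnit}

    // The connection is muted while its one delayed response sits in the
    // queue, so it cannot submit another request; a separate expiry thread
    // sends the response and unmutes. Hence at most one throttled request
    // per connection is ever held, and no handler thread is tied up.
    final class ThrottledResponse(val connectionId: String, delayMs: Long) extends Delayed {
      private val dueNanos = System.nanoTime + TimeUnit.MILLISECONDS.toNanos(delayMs)
      override def getDelay(unit: TimeUnit): Long =
        unit.convert(dueNanos - System.nanoTime, TimeUnit.NANOSECONDS)
      override def compareTo(other: Delayed): Int =
        java.lang.Long.compare(getDelay(TimeUnit.NANOSECONDS), other.getDelay(TimeUnit.NANOSECONDS))
    }

    object ThrottleQueueSketch {
      private val pending = new DelayQueue[ThrottledResponse]()

      def throttle(connectionId: String, delayMs: Long): Unit = {
        // muteConnection(connectionId)  <- hypothetical: stop reading this socket
        pending.add(new ThrottledResponse(connectionId, delayMs))
      }

      def expiryLoop(): Unit = {
        val due = pending.take() // blocks until the next response is due
        // sendResponse(due); unmuteConnection(due.connectionId)  <- hypothetical
        println(s"releasing ${due.connectionId}")
      }
    }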
#4 As with byte rate quotas, to use the full allocated quotas,
clients/users would need to use partitions that are distributed across the
cluster. The alternative of using cluster-wide quotas instead of per-broker
quotas would be far too complex to implement.

Dong: We currently have two ClientQuotaManagers for quota types Fetch and
Produce. A new one will be added for IOThread, which manages quotas for I/O
thread utilization. This will not update the Fetch or Produce queue-size,
but will have a separate metric for the queue-size.  I wasn't planning to
add any additional metrics apart from the equivalent ones for existing
quotas as part of this KIP. Ratio of byte-rate to I/O thread utilization
could be slightly misleading since it depends on the sequence of requests.
But we can look into more metrics after the KIP is implemented if required.
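
As an example of the naming convention (example names following the
existing pattern, not the exact implementation):

    // Each quota type gets its own independent sensors; nothing is shared
    // between byte-rate and request-time throttling.
    object QuotaSensorNames {
      val quotaTypes = Seq("Produce", "Fetch", "LeaderReplication",
                           "FollowerReplication", "IOThread")

      def throttleTimeSensor(quotaType: String, clientId: String): String =
        s"${quotaType}ThrottleTime-$clientId"

      def queueSizeSensor(quotaType: String): String =
        s"$quotaType-delayQueue-size"
    }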

I think we need to limit the maximum delay since all requests are
throttled. If a client has a quota of 0.001 units and a single request used
50ms, we don't want to delay all requests from the client by 50 seconds,
throwing the client out of all its consumer groups. The issue is only if a
user is allocated a quota that is insufficient to process one large
request. The expectation is that the units allocated per user will be much
higher than the time taken to process one request and the limit should
seldom be applied. Agree this needs proper documentation.
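
A worked sketch of the cap (assumed formula and window size):

    // With a quota of 0.001 units, a single 50 ms request would naively be
    // repaid over ~50000 ms; capping the delay at the quota window keeps the
    // client inside its session and group timeouts.
    object DelayCapSketch {
      def throttleMs(usedMs: Double, quotaUnits: Double, windowMs: Long): Long = {
        val allowedMs  = quotaUnits * windowMs
        val uncappedMs = math.max(0.0, (usedMs - allowedMs) / quotaUnits)
        math.min(uncappedMs, windowMs.toDouble).toLong
      }

      def main(args: Array[String]): Unit = {
        println(throttleMs(usedMs = 50.0, quotaUnits = 0.001, windowMs = 1000)) // 1000, capped from 49000
      }
    }

As Dong points out, the cost of the cap is that a client can exceed its
quota over longer periods; the deviation is bounded by the extra work a
single request can do within one window.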

Regards,

Rajini


On Thu, Feb 23, 2017 at 8:04 PM, radai <ra...@gmail.com> wrote:

> @jun: i wasnt concerned about tying up a request processing thread, but
> IIUC the code does still read the entire request out, which might add-up to
> a non-negligible amount of memory.
>
> On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com> wrote:
>
> > Hey Rajini,
> >
> > The current KIP says that the maximum delay will be reduced to window
> size
> > if it is larger than the window size. I have a concern with this:
> >
> > 1) This essentially means that the user is allowed to exceed their quota
> > over a long period of time. Can you provide an upper bound on this
> > deviation?
> >
> > 2) What is the motivation for cap the maximum delay by the window size? I
> > am wondering if there is better alternative to address the problem.
> >
> > 3) It means that the existing metric-related config will have a more
> > directly impact on the mechanism of this io-thread-unit-based quota. The
> > may be an important change depending on the answer to 1) above. We
> probably
> > need to document this more explicitly.
> >
> > Dong
> >
> >
> > On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com> wrote:
> >
> > > Hey Jun,
> > >
> > > Yeah you are right. I thought it wasn't because at LinkedIn it will be
> > too
> > > much pressure on inGraph to expose those per-clientId metrics so we
> ended
> > > up printing them periodically to local log. Never mind if it is not a
> > > general problem.
> > >
> > > Hey Rajini,
> > >
> > > - I agree with Jay that we probably don't want to add a new field for
> > > every quota ProduceResponse or FetchResponse. Is there any use-case for
> > > having separate throttle-time fields for byte-rate-quota and
> > > io-thread-unit-quota? You probably need to document this as interface
> > > change if you plan to add new field in any request.
> > >
> > > - I don't think IOThread belongs to quotaType. The existing quota types
> > > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify
> the
> > > type of request that are throttled, not the quota mechanism that is
> > applied.
> > >
> > > - If a request is throttled due to this io-thread-unit-based quota, is
> > the
> > > existing queue-size metric in ClientQuotaManager incremented?
> > >
> > > - In the interest of providing guide line for admin to decide
> > > io-thread-unit-based quota and for user to understand its impact on
> their
> > > traffic, would it be useful to have a metric that shows the overall
> > > byte-rate per io-thread-unit? Can we also show this a per-clientId
> > metric?
> > >
> > > Thanks,
> > > Dong
> > >
> > >
> > > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > >> Hi, Ismael,
> > >>
> > >> For #3, typically, an admin won't configure more io threads than CPU
> > >> cores,
> > >> but it's possible for an admin to start with fewer io threads than
> cores
> > >> and grow that later on.
> > >>
> > >> Hi, Dong,
> > >>
> > >> I think the throttleTime sensor on the broker tells the admin whether
> a
> > >> user/clentId is throttled or not.
> > >>
> > >> Hi, Radi,
> > >>
> > >> The reasoning for delaying the throttled requests on the broker
> instead
> > of
> > >> returning an error immediately is that the latter has no way to
> prevent
> > >> the
> > >> client from retrying immediately, which will make things worse. The
> > >> delaying logic is based off a delay queue. A separate expiration
> thread
> > >> just waits on the next to be expired request. So, it doesn't tie up a
> > >> request handler thread.
> > >>
> > >> Thanks,
> > >>
> > >> Jun
> > >>
> > >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk>
> wrote:
> > >>
> > >> > Hi Jay,
> > >> >
> > >> > Regarding 1, I definitely like the simplicity of keeping a single
> > >> throttle
> > >> > time field in the response. The downside is that the client metrics
> > >> will be
> > >> > more coarse grained.
> > >> >
> > >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> > >> > `log.cleaner.min.cleanable.ratio`.
> > >> >
> > >> > Ismael
> > >> >
> > >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io>
> wrote:
> > >> >
> > >> > > A few minor comments:
> > >> > >
> > >> > >    1. Isn't it the case that the throttling time response field
> > should
> > >> > have
> > >> > >    the total time your request was throttled irrespective of the
> > >> quotas
> > >> > > that
> > >> > >    caused that. Limiting it to byte rate quota doesn't make sense,
> > >> but I
> > >> > > also
> > >> > >    I don't think we want to end up adding new fields in the
> response
> > >> for
> > >> > > every
> > >> > >    single thing we quota, right?
> > >> > >    2. I don't think we should make this quota specifically about
> io
> > >> > >    threads. Once we introduce these quotas people set them and
> > expect
> > >> > them
> > >> > > to
> > >> > >    be enforced (and if they aren't it may cause an outage). As a
> > >> result
> > >> > > they
> > >> > >    are a bit more sensitive than normal configs, I think. The
> > current
> > >> > > thread
> > >> > >    pools seem like something of an implementation detail and not
> the
> > >> > level
> > >> > > the
> > >> > >    user-facing quotas should be involved with. I think it might be
> > >> better
> > >> > > to
> > >> > >    make this a general request-time throttle with no mention in
> the
> > >> > naming
> > >> > >    about I/O threads and simply acknowledge the current limitation
> > >> (which
> > >> > > we
> > >> > >    may someday fix) in the docs that this covers only the time
> after
> > >> the
> > >> > >    thread is read off the network.
> > >> > >    3. As such I think the right interface to the user would be
> > >> something
> > >> > >    like percent_request_time and be in {0,...100} or
> > >> request_time_ratio
> > >> > > and be
> > >> > >    in {0.0,...,1.0} (I think "ratio" is the terminology we used if
> > the
> > >> > > scale
> > >> > >    is between 0 and 1 in the other metrics, right?)
> > >> > >
> > >> > > -Jay
> > >> > >
> > >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
> > >> rajinisivaram@gmail.com
> > >> > >
> > >> > > wrote:
> > >> > >
> > >> > > > Guozhang/Dong,
> > >> > > >
> > >> > > > Thank you for the feedback.
> > >> > > >
> > >> > > > Guozhang : I have updated the section on co-existence of byte
> rate
> > >> and
> > >> > > > request time quotas.
> > >> > > >
> > >> > > > Dong: I hadn't added much detail to the metrics and sensors
> since
> > >> they
> > >> > > are
> > >> > > > going to be very similar to the existing metrics and sensors. To
> > >> avoid
> > >> > > > confusion, I have now added more detail. All metrics are in the
> > >> group
> > >> > > > "quotaType" and all sensors have names starting with "quotaType"
> > >> (where
> > >> > > > quotaType is Produce/Fetch/LeaderReplication/
> > >> > > > FollowerReplication/*IOThread*).
> > >> > > > So there will be no reuse of existing metrics/sensors. The new
> > ones
> > >> for
> > >> > > > request processing time based throttling will be completely
> > >> independent
> > >> > > of
> > >> > > > existing metrics/sensors, but will be consistent in format.
> > >> > > >
> > >> > > > The existing throttle_time_ms field in produce/fetch responses
> > will
> > >> not
> > >> > > be
> > >> > > > impacted by this KIP. That will continue to return byte-rate
> based
> > >> > > > throttling times. In addition, a new field
> > request_throttle_time_ms
> > >> > will
> > >> > > be
> > >> > > > added to return request quota based throttling times. These will
> > be
> > >> > > exposed
> > >> > > > as new metrics on the client-side.
> > >> > > >
> > >> > > > Since all metrics and sensors are different for each type of
> > quota,
> > >> I
> > >> > > > believe there is already sufficient metrics to monitor
> throttling
> > on
> > >> > both
> > >> > > > client and broker side for each type of throttling.
> > >> > > >
> > >> > > > Regards,
> > >> > > >
> > >> > > > Rajini
> > >> > > >
> > >> > > >
> > >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <li...@gmail.com>
> > >> wrote:
> > >> > > >
> > >> > > > > Hey Rajini,
> > >> > > > >
> > >> > > > > I think it makes a lot of sense to use io_thread_units as
> metric
> > >> to
> > >> > > quota
> > >> > > > > user's traffic here. LGTM overall. I have some questions
> > regarding
> > >> > > > sensors.
> > >> > > > >
> > >> > > > > - Can you be more specific in the KIP what sensors will be
> > added?
> > >> For
> > >> > > > > example, it will be useful to specify the name and attributes
> of
> > >> > these
> > >> > > > new
> > >> > > > > sensors.
> > >> > > > >
> > >> > > > > - We currently have throttle-time and queue-size for byte-rate
> > >> based
> > >> > > > quota.
> > >> > > > > Are you going to have separate throttle-time and queue-size
> for
> > >> > > requests
> > >> > > > > throttled by io_thread_unit-based quota, or will they share
> the
> > >> same
> > >> > > > > sensor?
> > >> > > > >
> > >> > > > > - Does the throttle-time in the ProduceResponse and
> > FetchResponse
> > >> > > > contains
> > >> > > > > time due to io_thread_unit-based quota?
> > >> > > > >
> > >> > > > > - Currently kafka server doesn't not provide any log or
> metrics
> > >> that
> > >> > > > tells
> > >> > > > > whether any given clientId (or user) is throttled. This is not
> > too
> > >> > bad
> > >> > > > > because we can still check the client-side byte-rate metric to
> > >> > validate
> > >> > > > > whether a given client is throttled. But with this
> > io_thread_unit,
> > >> > > there
> > >> > > > > will be no way to validate whether a given client is slow
> > because
> > >> it
> > >> > > has
> > >> > > > > exceeded its io_thread_unit limit. It is necessary for user to
> > be
> > >> > able
> > >> > > to
> > >> > > > > know this information to figure how whether they have reached
> > >> there
> > >> > > quota
> > >> > > > > limit. How about we add log4j log on the server side to
> > >> periodically
> > >> > > > print
> > >> > > > > the (client_id, byte-rate-throttle-time,
> > >> > io-thread-unit-throttle-time)
> > >> > > so
> > >> > > > > that kafka administrator can figure those users that have
> > reached
> > >> > their
> > >> > > > > limit and act accordingly?
> > >> > > > >
> > >> > > > > Thanks,
> > >> > > > > Dong
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > >
> > >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
> > >> wangguoz@gmail.com>
> > >> > > > wrote:
> > >> > > > >
> > >> > > > > > Made a pass over the doc, overall LGTM except a minor
> comment
> > on
> > >> > the
> > >> > > > > > throttling implementation:
> > >> > > > > >
> > >> > > > > > Stated as "Request processing time throttling will be
> applied
> > on
> > >> > top
> > >> > > if
> > >> > > > > > necessary." I thought that it meant the request processing
> > time
> > >> > > > > throttling
> > >> > > > > > is applied first, but continue reading I found it actually
> > >> meant to
> > >> > > > apply
> > >> > > > > > produce / fetch byte rate throttling first.
> > >> > > > > >
> > >> > > > > > Also the last sentence "The remaining delay if any is
> applied
> > to
> > >> > the
> > >> > > > > > response." is a bit confusing to me. Maybe rewording it a
> bit?
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > Guozhang
> > >> > > > > >
> > >> > > > > >
> > >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io>
> > >> wrote:
> > >> > > > > >
> > >> > > > > > > Hi, Rajini,
> > >> > > > > > >
> > >> > > > > > > Thanks for the updated KIP. The latest proposal looks good
> > to
> > >> me.
> > >> > > > > > >
> > >> > > > > > > Jun
> > >> > > > > > >
> > >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> > >> > > > > rajinisivaram@gmail.com
> > >> > > > > > >
> > >> > > > > > > wrote:
> > >> > > > > > >
> > >> > > > > > > > Jun/Roger,
> > >> > > > > > > >
> > >> > > > > > > > Thank you for the feedback.
> > >> > > > > > > >
> > >> > > > > > > > 1. I have updated the KIP to use absolute units instead
> of
> > >> > > > > percentage.
> > >> > > > > > > The
> > >> > > > > > > > property is called* io_thread_units* to align with the
> > >> thread
> > >> > > count
> > >> > > > > > > > property *num.io.threads*. When we implement network
> > thread
> > >> > > > > utilization
> > >> > > > > > > > quotas, we can add another property
> > *network_thread_units.*
> > >> > > > > > > >
> > >> > > > > > > > 2. ControlledShutdown is already listed under the exempt
> > >> > > requests.
> > >> > > > > Jun,
> > >> > > > > > > did
> > >> > > > > > > > you mean a different request that needs to be added? The
> > >> four
> > >> > > > > requests
> > >> > > > > > > > currently exempt in the KIP are StopReplica,
> > >> > ControlledShutdown,
> > >> > > > > > > > LeaderAndIsr and UpdateMetadata. These are controlled
> > using
> > >> > > > > > ClusterAction
> > >> > > > > > > > ACL, so it is easy to exclude and only throttle if
> > >> > unauthorized.
> > >> > > I
> > >> > > > > > wasn't
> > >> > > > > > > > sure if there are other requests used only for
> > inter-broker
> > >> > that
> > >> > > > > needed
> > >> > > > > > > to
> > >> > > > > > > > be excluded.
> > >> > > > > > > >
> > >> > > > > > > > 3. I was thinking the smallest change would be to
> replace
> > >> all
> > >> > > > > > references
> > >> > > > > > > to
> > >> > > > > > > > *requestChannel.sendResponse()* with a local method
> > >> > > > > > > > *sendResponseMaybeThrottle()* that does the throttling
> if
> > >> any
> > >> > > plus
> > >> > > > > send
> > >> > > > > > > > response. If we throttle first in *KafkaApis.handle()*,
> > the
> > >> > time
> > >> > > > > spent
> > >> > > > > > > > within the method handling the request will not be
> > recorded
> > >> or
> > >> > > used
> > >> > > > > in
> > >> > > > > > > > throttling. We can look into this again when the PR is
> > ready
> > >> > for
> > >> > > > > > review.
> > >> > > > > > > >
> > >> > > > > > > > Regards,
> > >> > > > > > > >
> > >> > > > > > > > Rajini
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > >
> > >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > >> > > > > roger.hoover@gmail.com>
> > >> > > > > > > > wrote:
> > >> > > > > > > >
> > >> > > > > > > > > Great to see this KIP and the excellent discussion.
> > >> > > > > > > > >
> > >> > > > > > > > > To me, Jun's suggestion makes sense.  If my
> application
> > is
> > >> > > > > allocated
> > >> > > > > > 1
> > >> > > > > > > > > request handler unit, then it's as if I have a Kafka
> > >> broker
> > >> > > with
> > >> > > > a
> > >> > > > > > > single
> > >> > > > > > > > > request handler thread dedicated to me.  That's the
> > most I
> > >> > can
> > >> > > > use,
> > >> > > > > > at
> > >> > > > > > > > > least.  That allocation doesn't change even if an
> admin
> > >> later
> > >> > > > > > increases
> > >> > > > > > > > the
> > >> > > > > > > > > size of the request thread pool on the broker.  It's
> > >> similar
> > >> > to
> > >> > > > the
> > >> > > > > > CPU
> > >> > > > > > > > > abstraction that VMs and containers get from
> hypervisors
> > >> or
> > >> > OS
> > >> > > > > > > > schedulers.
> > >> > > > > > > > > While different client access patterns can use wildly
> > >> > different
> > >> > > > > > amounts
> > >> > > > > > > > of
> > >> > > > > > > > > request thread resources per request, a given
> > application
> > >> > will
> > >> > > > > > > generally
> > >> > > > > > > > > have a stable access pattern and can figure out
> > >> empirically
> > >> > how
> > >> > > > > many
> > >> > > > > > > > > "request thread units" it needs to meet it's
> > >> > throughput/latency
> > >> > > > > > goals.
> > >> > > > > > > > >
> > >> > > > > > > > > Cheers,
> > >> > > > > > > > >
> > >> > > > > > > > > Roger
> > >> > > > > > > > >
> > >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <
> > >> jun@confluent.io>
> > >> > > > wrote:
> > >> > > > > > > > >
> > >> > > > > > > > > > Hi, Rajini,
> > >> > > > > > > > > >
> > >> > > > > > > > > > Thanks for the updated KIP. A few more comments.
> > >> > > > > > > > > >
> > >> > > > > > > > > > 1. A concern of request_time_percent is that it's
> not
> > an
> > >> > > > absolute
> > >> > > > > > > > value.
> > >> > > > > > > > > > Let's say you give a user a 10% limit. If the admin
> > >> doubles
> > >> > > the
> > >> > > > > > > number
> > >> > > > > > > > of
> > >> > > > > > > > > > request handler threads, that user now actually has
> > >> twice
> > >> > the
> > >> > > > > > > absolute
> > >> > > > > > > > > > capacity. This may confuse people a bit. So, perhaps
> > >> > setting
> > >> > > > the
> > >> > > > > > > quota
> > >> > > > > > > > > > based on an absolute request thread unit is better.
> > >> > > > > > > > > >
> > >> > > > > > > > > > 2. ControlledShutdownRequest is also an inter-broker
> > >> > request
> > >> > > > and
> > >> > > > > > > needs
> > >> > > > > > > > to
> > >> > > > > > > > > > be excluded from throttling.
> > >> > > > > > > > > >
> > >> > > > > > > > > > 3. Implementation wise, I am wondering if it's
> simpler
> > >> to
> > >> > > apply
> > >> > > > > the
> > >> > > > > > > > > request
> > >> > > > > > > > > > time throttling first in KafkaApis.handle().
> > Otherwise,
> > >> we
> > >> > > will
> > >> > > > > > need
> > >> > > > > > > to
> > >> > > > > > > > > add
> > >> > > > > > > > > > the throttling logic in each type of request.
> > >> > > > > > > > > >
> > >> > > > > > > > > > Thanks,
> > >> > > > > > > > > >
> > >> > > > > > > > > > Jun
> > >> > > > > > > > > >
> > >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> > >> > > > > > > > rajinisivaram@gmail.com
> > >> > > > > > > > > >
> > >> > > > > > > > > > wrote:
> > >> > > > > > > > > >
> > >> > > > > > > > > > > Jun,
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Thank you for the review.
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > I have reverted to the original KIP that throttles
> > >> based
> > >> > on
> > >> > > > > > request
> > >> > > > > > > > > > handler
> > >> > > > > > > > > > > utilization. At the moment, it uses percentage,
> but
> > I
> > >> am
> > >> > > > happy
> > >> > > > > to
> > >> > > > > > > > > change
> > >> > > > > > > > > > to
> > >> > > > > > > > > > > a fraction (out of 1 instead of 100) if required.
> I
> > >> have
> > >> > > > added
> > >> > > > > > the
> > >> > > > > > > > > > examples
> > >> > > > > > > > > > > from this discussion to the KIP. Also added a
> > "Future
> > >> > Work"
> > >> > > > > > section
> > >> > > > > > > > to
> > >> > > > > > > > > > > address network thread utilization. The
> > configuration
> > >> is
> > >> > > > named
> > >> > > > > > > > > > > "request_time_percent" with the expectation that
> it
> > >> can
> > >> > > also
> > >> > > > be
> > >> > > > > > > used
> > >> > > > > > > > as
> > >> > > > > > > > > > the
> > >> > > > > > > > > > > limit for network thread utilization when that is
> > >> > > > implemented,
> > >> > > > > so
> > >> > > > > > > > that
> > >> > > > > > > > > > > users have to set only one config for the two and
> > not
> > >> > have
> > >> > > to
> > >> > > > > > worry
> > >> > > > > > > > > about
> > >> > > > > > > > > > > the internal distribution of the work between the
> > two
> > >> > > thread
> > >> > > > > > pools
> > >> > > > > > > in
> > >> > > > > > > > > > > Kafka.
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Regards,
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > Rajini
> > >> > > > > > > > > > >
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> > >> > > jun@confluent.io>
> > >> > > > > > > wrote:
> > >> > > > > > > > > > >
> > >> > > > > > > > > > > > Hi, Rajini,
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Thanks for the proposal.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > The benefit of using the request processing time
> > >> over
> > >> > the
> > >> > > > > > request
> > >> > > > > > > > > rate
> > >> > > > > > > > > > is
> > >> > > > > > > > > > > > exactly what people have said. I will just
> expand
> > >> that
> > >> > a
> > >> > > > bit.
> > >> > > > > > > > > Consider
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > following case. The producer sends a produce
> > request
> > >> > > with a
> > >> > > > > > 10MB
> > >> > > > > > > > > > message
> > >> > > > > > > > > > > > but compressed to 100KB with gzip. The
> > >> decompression of
> > >> > > the
> > >> > > > > > > message
> > >> > > > > > > > > on
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > broker could take 10-15 seconds, during which
> > time,
> > >> a
> > >> > > > request
> > >> > > > > > > > handler
> > >> > > > > > > > > > > > thread is completely blocked. In this case,
> > neither
> > >> the
> > >> > > > > byte-in
> > >> > > > > > > > quota
> > >> > > > > > > > > > nor
> > >> > > > > > > > > > > > the request rate quota may be effective in
> > >> protecting
> > >> > the
> > >> > > > > > broker.
> > >> > > > > > > > > > > Consider
> > >> > > > > > > > > > > > another case. A consumer group starts with 10
> > >> instances
> > >> > > and
> > >> > > > > > later
> > >> > > > > > > > on
> > >> > > > > > > > > > > > switches to 20 instances. The request rate will
> > >> likely
> > >> > > > > double,
> > >> > > > > > > but
> > >> > > > > > > > > the
> > >> > > > > > > > > > > > actually load on the broker may not double since
> > >> each
> > >> > > fetch
> > >> > > > > > > request
> > >> > > > > > > > > > only
> > >> > > > > > > > > > > > contains half of the partitions. Request rate
> > quota
> > >> may
> > >> > > not
> > >> > > > > be
> > >> > > > > > > easy
> > >> > > > > > > > > to
> > >> > > > > > > > > > > > configure in this case.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > What we really want is to be able to prevent a
> > >> client
> > >> > > from
> > >> > > > > > using
> > >> > > > > > > > too
> > >> > > > > > > > > > much
> > >> > > > > > > > > > > > of the server side resources. In this particular
> > >> KIP,
> > >> > > this
> > >> > > > > > > resource
> > >> > > > > > > > > is
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > capacity of the request handler threads. I agree
> > >> that
> > >> > it
> > >> > > > may
> > >> > > > > > not
> > >> > > > > > > be
> > >> > > > > > > > > > > > intuitive for the users to determine how to set
> > the
> > >> > right
> > >> > > > > > limit.
> > >> > > > > > > > > > However,
> > >> > > > > > > > > > > > this is not completely new and has been done in
> > the
> > >> > > > container
> > >> > > > > > > world
> > >> > > > > > > > > > > > already. For example, Linux cgroup (
> > >> > > > > https://access.redhat.com/
> > >> > > > > > > > > > > > documentation/en-US/Red_Hat_En
> > >> terprise_Linux/6/html/
> > >> > > > > > > > > > > > Resource_Management_Guide/sec-cpu.html) has the
> > >> > concept
> > >> > > of
> > >> > > > > > > > > > > > cpu.cfs_quota_us,
> > >> > > > > > > > > > > > which specifies the total amount of time in
> > >> > microseconds
> > >> > > > for
> > >> > > > > > > which
> > >> > > > > > > > > all
> > >> > > > > > > > > > > > tasks in a cgroup can run during a one second
> > >> period.
> > >> > We
> > >> > > > can
> > >> > > > > > > > > > potentially
> > >> > > > > > > > > > > > model the request handler threads in a similar
> > way.
> > >> For
> > >> > > > > > example,
> > >> > > > > > > > each
> > >> > > > > > > > > > > > request handler thread can be 1 request handler
> > unit
> > >> > and
> > >> > > > the
> > >> > > > > > > admin
> > >> > > > > > > > > can
> > >> > > > > > > > > > > > configure a limit on how many units (say 0.01) a
> > >> client
> > >> > > can
> > >> > > > > > have.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Regarding not throttling the internal broker to
> > >> broker
> > >> > > > > > requests.
> > >> > > > > > > We
> > >> > > > > > > > > > could
> > >> > > > > > > > > > > > do that. Alternatively, we could just let the
> > admin
> > >> > > > > configure a
> > >> > > > > > > > high
> > >> > > > > > > > > > > limit
> > >> > > > > > > > > > > > for the kafka user (it may not be able to do
> that
> > >> > easily
> > >> > > > > based
> > >> > > > > > on
> > >> > > > > > > > > > > clientId
> > >> > > > > > > > > > > > though).
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Ideally we want to be able to protect the
> > >> utilization
> > >> > of
> > >> > > > the
> > >> > > > > > > > network
> > >> > > > > > > > > > > thread
> > >> > > > > > > > > > > > pool too. The difficult is mostly what Rajini
> > said:
> > >> (1)
> > >> > > The
> > >> > > > > > > > mechanism
> > >> > > > > > > > > > for
> > >> > > > > > > > > > > > throttling the requests is through Purgatory and
> > we
> > >> > will
> > >> > > > have
> > >> > > > > > to
> > >> > > > > > > > > think
> > >> > > > > > > > > > > > through how to integrate that into the network
> > >> layer.
> > >> > > (2)
> > >> > > > In
> > >> > > > > > the
> > >> > > > > > > > > > network
> > >> > > > > > > > > > > > layer, currently we know the user, but not the
> > >> clientId
> > >> > > of
> > >> > > > > the
> > >> > > > > > > > > request.
> > >> > > > > > > > > > > So,
> > >> > > > > > > > > > > > it's a bit tricky to throttle based on clientId
> > >> there.
> > >> > > > Plus,
> > >> > > > > > the
> > >> > > > > > > > > > byteOut
> > >> > > > > > > > > > > > quota can already protect the network thread
> > >> > utilization
> > >> > > > for
> > >> > > > > > > fetch
> > >> > > > > > > > > > > > requests. So, if we can't figure out this part
> > right
> > >> > now,
> > >> > > > > just
> > >> > > > > > > > > focusing
> > >> > > > > > > > > > > on
> > >> > > > > > > > > > > > the request handling threads for this KIP is
> > still a
> > >> > > useful
> > >> > > > > > > > feature.
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Thanks,
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > Jun
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram
> <
> > >> > > > > > > > > > rajinisivaram@gmail.com
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > wrote:
> > >> > > > > > > > > > > >
> > >> > > > > > > > > > > > > Thank you all for the feedback.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Jay: I have removed exemption for consumer
> > >> heartbeat
> > >> > > etc.
> > >> > > > > > Agree
> > >> > > > > > > > > that
> > >> > > > > > > > > > > > > protecting the cluster is more important than
> > >> > > protecting
> > >> > > > > > > > individual
> > >> > > > > > > > > > > apps.
> > >> > > > > > > > > > > > > Have retained the exemption for
> > >> > > StopReplicat/LeaderAndIsr
> > >> > > > > > etc,
> > >> > > > > > > > > these
> > >> > > > > > > > > > > are
> > >> > > > > > > > > > > > > throttled only if authorization fails (so
> can't
> > be
> > >> > used
> > >> > > > for
> > >> > > > > > DoS
> > >> > > > > > > > > > attacks
> > >> > > > > > > > > > > > in
> > >> > > > > > > > > > > > > a secure cluster, but allows inter-broker
> > >> requests to
> > >> > > > > > complete
> > >> > > > > > > > > > without
> > >> > > > > > > > > > > > > delays).
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > I will wait another day to see if these is any
> > >> > > objection
> > >> > > > to
> > >> > > > > > > > quotas
> > >> > > > > > > > > > > based
> > >> > > > > > > > > > > > on
> > >> > > > > > > > > > > > > request processing time (as opposed to request
> > >> rate)
> > >> > > and
> > >> > > > if
> > >> > > > > > > there
> > >> > > > > > > > > are
> > >> > > > > > > > > > > no
> > >> > > > > > > > > > > > > objections, I will revert to the original
> > proposal
> > >> > with
> > >> > > > > some
> > >> > > > > > > > > changes.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > The original proposal was only including the
> > time
> > >> > used
> > >> > > by
> > >> > > > > the
> > >> > > > > > > > > request
> > >> > > > > > > > > > > > > handler threads (that made calculation easy).
> I
> > >> think
> > >> > > the
> > >> > > > > > > > > suggestion
> > >> > > > > > > > > > is
> > >> > > > > > > > > > > > to
> > >> > > > > > > > > > > > > include the time spent in the network threads
> as
> > >> well
> > >> > > > since
> > >> > > > > > > that
> > >> > > > > > > > > may
> > >> > > > > > > > > > be
> > >> > > > > > > > > > > > > significant. As Jay pointed out, it is more
> > >> > complicated
> > >> > > > to
> > >> > > > > > > > > calculate
> > >> > > > > > > > > > > the
> > >> > > > > > > > > > > > > total available CPU time and convert to a
> ratio
> > >> when
> > >> > > > there
> > >> > > > > > *m*
> > >> > > > > > > > I/O
> > >> > > > > > > > > > > > threads
> > >> > > > > > > > > > > > > and *n* network threads.
> > >> > ThreadMXBean#getThreadCPUTime(
> > >> > > )
> > >> > > > > may
> > >> > > > > > > > give
> > >> > > > > > > > > us
> > >> > > > > > > > > > > > what
> > >> > > > > > > > > > > > > we want, but it can be very expensive on some
> > >> > > platforms.
> > >> > > > As
> > >> > > > > > > > Becket
> > >> > > > > > > > > > and
> > >> > > > > > > > > > > > > Guozhang have pointed out, we do have several
> > time
> > >> > > > > > measurements
> > >> > > > > > > > > > already
> > >> > > > > > > > > > > > for
> > >> > > > > > > > > > > > > generating metrics that we could use, though
> we
> > >> might
> > >> > > > want
> > >> > > > > to
> > >> > > > > > > > > switch
> > >> > > > > > > > > > to
> > >> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis()
> since
> > >> some
> > >> > of
> > >> > > > the
> > >> > > > > > > > values
> > >> > > > > > > > > > for
> > >> > > > > > > > > > > > > small requests may be < 1ms. But rather than
> add
> > >> up
> > >> > the
> > >> > > > > time
> > >> > > > > > > > spent
> > >> > > > > > > > > in
> > >> > > > > > > > > > > I/O
> > >> > > > > > > > > > > > > thread and network thread, wouldn't it be
> better
> > >> to
> > >> > > > convert
> > >> > > > > > the
> > >> > > > > > > > > time
> > >> > > > > > > > > > > > spent
> > >> > > > > > > > > > > > > on each thread into a separate ratio? UserA
> has
> > a
> > >> > > request
> > >> > > > > > quota
> > >> > > > > > > > of
> > >> > > > > > > > > > 5%.
> > >> > > > > > > > > > > > Can
> > >> > > > > > > > > > > > > we take that to mean that UserA can use 5% of
> > the
> > >> > time
> > >> > > on
> > >> > > > > > > network
> > >> > > > > > > > > > > threads
> > >> > > > > > > > > > > > > and 5% of the time on I/O threads? If either
> is
> > >> > > exceeded,
> > >> > > > > the
> > >> > > > > > > > > > response
> > >> > > > > > > > > > > is
> > >> > > > > > > > > > > > > throttled - it would mean maintaining two sets
> > of
> > >> > > metrics
> > >> > > > > for
> > >> > > > > > > the
> > >> > > > > > > > > two
> > >> > > > > > > > > > > > > durations, but would result in more meaningful ratios. We could
> > >> > > > > > > > > > > > > define two quota limits (UserA has 5% of request threads and 10% of
> > >> > > > > > > > > > > > > network threads), but that seems unnecessary and harder to explain
> > >> > > > > > > > > > > > > to users.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Back to why and how quotas are applied to network thread utilization:
> > >> > > > > > > > > > > > > a) In the case of fetch, the time spent in the network thread may be
> > >> > > > > > > > > > > > > significant and I can see the need to include this. Are there other
> > >> > > > > > > > > > > > > requests where the network thread utilization is significant? In the
> > >> > > > > > > > > > > > > case of fetch, request handler thread utilization would throttle
> > >> > > > > > > > > > > > > clients with a high request rate and low data volume, and the fetch
> > >> > > > > > > > > > > > > byte rate quota will throttle clients with high data volume. Network
> > >> > > > > > > > > > > > > thread utilization is perhaps proportional to the data volume. I am
> > >> > > > > > > > > > > > > wondering if we even need to throttle based on network thread
> > >> > > > > > > > > > > > > utilization or whether the data volume quota covers this case.
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > b) At the moment, we record and check for quota violation at the same
> > >> > > > > > > > > > > > > time. If a quota is violated, the response is delayed. Using Jay's
> > >> > > > > > > > > > > > > example of disk reads for fetches happening in the network thread, we
> > >> > > > > > > > > > > > > can't record and delay a response after the disk reads. We could
> > >> > > > > > > > > > > > > record the time spent on the network thread when the response is
> > >> > > > > > > > > > > > > complete and introduce a delay for handling a subsequent request
> > >> > > > > > > > > > > > > (separating out recording and quota violation handling in the case of
> > >> > > > > > > > > > > > > network thread overload). Does that make sense?
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Regards,
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > Rajini
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com> wrote:
> > >> > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Hey Jay,
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > >> > > > > > > > > > > > > > thinking that maybe we can use the existing request statistics. They
> > >> > > > > > > > > > > > > > are already very detailed, so we can probably see the approximate CPU
> > >> > > > > > > > > > > > > > time from them, e.g. something like (total_time -
> > >> > > > > > > > > > > > > > request/response_queue_time - remote_time).
> > >> > > > > > > > > > > > > >
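> > >> > > > > > > > > > > > > > As a rough sketch of that arithmetic (the parameter names here are
> > >> > > > > > > > > > > > > > illustrative, not the actual metric names in the broker):
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > >     // Approximate CPU time for one request, derived from the timings
> > >> > > > > > > > > > > > > >     // the broker already records. All values are in nanoseconds.
> > >> > > > > > > > > > > > > >     long approxCpuTimeNanos(long totalTime, long requestQueueTime,
> > >> > > > > > > > > > > > > >                             long responseQueueTime, long remoteTime) {
> > >> > > > > > > > > > > > > >         // Subtract time spent waiting in queues and on remote
> > >> > > > > > > > > > > > > >         // replicas; what remains roughly reflects the time a broker
> > >> > > > > > > > > > > > > >         // thread was actively working on the request.
> > >> > > > > > > > > > > > > >         return totalTime - requestQueueTime - responseQueueTime - remoteTime;
> > >> > > > > > > > > > > > > >     }
> > >> > > > > > > > > > > > > >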
> > >> > > > > > > > > > > > > > I agree with Guozhang that when a user is throttled it is likely that
> > >> > > > > > > > > > > > > > we need to see if anything has gone wrong first, and if the users are
> > >> > > > > > > > > > > > > > well behaved and just need more resources, we will have to bump up
> > >> > > > > > > > > > > > > > the quota for them. It is true that pre-allocating CPU time quota
> > >> > > > > > > > > > > > > > precisely for the users is difficult. So in practice it would
> > >> > > > > > > > > > > > > > probably be more like first setting a relatively high protective CPU
> > >> > > > > > > > > > > > > > time quota for everyone and increasing it for some individual clients
> > >> > > > > > > > > > > > > > on demand.
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Thanks,
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > >> > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > This is a great proposal, glad to see it happening.
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or more specifically the
> > >> > > > > > > > > > > > > > > processing time ratio, instead of the request rate throttling as
> > >> > > > > > > > > > > > > > > well. Becket has summed up my rationale very well above, and one
> > >> > > > > > > > > > > > > > > thing to add here is that the former gives good support both for
> > >> > > > > > > > > > > > > > > "protecting against rogue clients" and for "utilizing a cluster for
> > >> > > > > > > > > > > > > > > multi-tenancy usage": when thinking about how to explain this to the
> > >> > > > > > > > > > > > > > > end users, I find it actually more natural than the request rate
> > >> > > > > > > > > > > > > > > since, as mentioned above, different requests will have quite
> > >> > > > > > > > > > > > > > > different "cost", and Kafka today already has various request types
> > >> > > > > > > > > > > > > > > (produce, fetch, admin, metadata, etc). Because of that, request
> > >> > > > > > > > > > > > > > > rate throttling may not be as effective unless it is set very
> > >> > > > > > > > > > > > > > > conservatively.
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > Regarding user reactions when they are throttled, I think it may
> > >> > > > > > > > > > > > > > > differ case by case, and needs to be discovered / guided by looking
> > >> > > > > > > > > > > > > > > at related metrics. So in other words users would not expect to get
> > >> > > > > > > > > > > > > > > additional information by simply being told "hey, you are
> > >> > > > > > > > > > > > > > > throttled", which is all that throttling does; they need to take a
> > >> > > > > > > > > > > > > > > follow-up step and see "hmm, I'm throttled probably because of ..",
> > >> > > > > > > > > > > > > > > which is done by looking at other metric values: e.g. whether I'm
> > >> > > > > > > > > > > > > > > bombarding the brokers with metadata requests, which are usually
> > >> > > > > > > > > > > > > > > cheap to handle but I'm sending thousands per second; or is it
> > >> > > > > > > > > > > > > > > because I'm catching up and hence sending very heavy fetch requests
> > >> > > > > > > > > > > > > > > with large min.bytes, etc.
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > Regarding the implementation, as once discussed with Jun, this seems
> > >> > > > > > > > > > > > > > > not very difficult since today we are already collecting the "thread
> > >> > > > > > > > > > > > > > > pool utilization" metrics, which is a single percentage
> > >> > > > > > > > > > > > > > > "aggregateIdleMeter" value; we are already effectively aggregating
> > >> > > > > > > > > > > > > > > it for each request in KafkaRequestHandler, and we can just extend
> > >> > > > > > > > > > > > > > > it by recording the source client id when handling requests and
> > >> > > > > > > > > > > > > > > aggregating by clientId as well as into the total aggregate.
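> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > A minimal sketch of that idea (class and method names here are
> > >> > > > > > > > > > > > > > > illustrative, not the actual KafkaRequestHandler code):
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >     import java.util.concurrent.ConcurrentHashMap;
> > >> > > > > > > > > > > > > > >     import java.util.concurrent.atomic.LongAdder;
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >     // Accumulates handler busy time per clientId next to the
> > >> > > > > > > > > > > > > > >     // existing total aggregate, so a per-client ratio can be derived.
> > >> > > > > > > > > > > > > > >     class ClientTimeAccounting {
> > >> > > > > > > > > > > > > > >         private final ConcurrentHashMap<String, LongAdder> busyByClient =
> > >> > > > > > > > > > > > > > >             new ConcurrentHashMap<>();
> > >> > > > > > > > > > > > > > >         private final LongAdder totalBusyNanos = new LongAdder();
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >         void recordHandled(String clientId, long elapsedNanos) {
> > >> > > > > > > > > > > > > > >             busyByClient
> > >> > > > > > > > > > > > > > >                 .computeIfAbsent(clientId, id -> new LongAdder())
> > >> > > > > > > > > > > > > > >                 .add(elapsedNanos);           // per-client aggregate
> > >> > > > > > > > > > > > > > >             totalBusyNanos.add(elapsedNanos); // existing total aggregate
> > >> > > > > > > > > > > > > > >         }
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > >         // Fraction of all handler busy time consumed by one client.
> > >> > > > > > > > > > > > > > >         double busyRatio(String clientId) {
> > >> > > > > > > > > > > > > > >             long total = totalBusyNanos.sum();
> > >> > > > > > > > > > > > > > >             LongAdder c = busyByClient.get(clientId);
> > >> > > > > > > > > > > > > > >             return (total == 0 || c == null) ? 0.0 : (double) c.sum() / total;
> > >> > > > > > > > > > > > > > >         }
> > >> > > > > > > > > > > > > > >     }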
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > Guozhang
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <jay@confluent.io> wrote:
> > >> > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > When I thought about it more deeply I came around to the "percent
> > >> > > > > > > > > > > > > > > > of processing time" metric too. It seems a lot closer to the thing
> > >> > > > > > > > > > > > > > > > we actually care about and need to protect. I also think this
> > >> > > > > > > > > > > > > > > > would be a very useful metric even in the absence of throttling,
> > >> > > > > > > > > > > > > > > > just to debug who's using capacity.
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > Two problems to consider:
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > >    1. I agree that for the user it is understandable what led to
> > >> > > > > > > > > > > > > > > >    their being throttled, but it is a bit hard to figure out the
> > >> > > > > > > > > > > > > > > >    safe range for them. i.e. if I have a new app that will send 200
> > >> > > > > > > > > > > > > > > >    messages/sec I can probably reason that I'll be under the
> > >> > > > > > > > > > > > > > > >    throttling limit of 300 req/sec. However if I need to be under a
> > >> > > > > > > > > > > > > > > >    10% CPU resources limit it may be a bit harder for me to know a
> > >> > > > > > > > > > > > > > > >    priori if I will or won't.
> > >> > > > > > > > > > > > > > > >    2. Calculating the available CPU time is a bit difficult since
> > >> > > > > > > > > > > > > > > >    there are actually two thread pools--the I/O threads and the
> > >> > > > > > > > > > > > > > > >    network threads. I think it might be workable to count just the
> > >> > > > > > > > > > > > > > > >    I/O thread time as in the proposal, but the network thread work
> > >> > > > > > > > > > > > > > > >    is actually non-trivial (e.g. all the disk reads for fetches
> > >> > > > > > > > > > > > > > > >    happen in that thread). If you count both the network and I/O
> > >> > > > > > > > > > > > > > > >    threads it can skew things a bit. E.g. say you have 50 network
> > >> > > > > > > > > > > > > > > >    threads, 10 I/O threads, and 8 cores, what is the available cpu
> > >> > > > > > > > > > > > > > > >    time in a second? I suppose this is a problem whenever you have
> > >> > > > > > > > > > > > > > > >    a bottleneck between I/O and network threads or if you end up
> > >> > > > > > > > > > > > > > > >    significantly over-provisioning one pool (both of which are hard
> > >> > > > > > > > > > > > > > > >    to avoid).
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > An alternative for CPU throttling would be to use this api:
> > >> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > That would let you track actual CPU usage across the network, I/O,
> > >> > > > > > > > > > > > > > > > and purgatory threads and look at it as a percentage of total
> > >> > > > > > > > > > > > > > > > cores. I think this fixes many problems in the reliability of the
> > >> > > > > > > > > > > > > > > > metric. Its meaning is slightly different as it is just CPU (you
> > >> > > > > > > > > > > > > > > > don't get charged for time blocking on I/O), but that may be okay
> > >> > > > > > > > > > > > > > > > because we already have a throttle on I/O. The downside is I think
> > >> > > > > > > > > > > > > > > > it is possible this api can be disabled or isn't always available,
> > >> > > > > > > > > > > > > > > > and it may also be expensive (also I've never used it so not sure
> > >> > > > > > > > > > > > > > > > if it really works the way I think).
> > >> > > > > > > > > > > > > > > >
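> > >> > > > > > > > > > > > > > > > Something like this, untested (getThreadCpuTime and
> > >> > > > > > > > > > > > > > > > isThreadCpuTimeSupported are real ThreadMXBean methods, but
> > >> > > > > > > > > > > > > > > > whether they are cheap, or enabled at all, is platform dependent):
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > >     import java.lang.management.ManagementFactory;
> > >> > > > > > > > > > > > > > > >     import java.lang.management.ThreadMXBean;
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > >     // Sketch: measure CPU time consumed by the current broker
> > >> > > > > > > > > > > > > > > >     // thread while it handles one request.
> > >> > > > > > > > > > > > > > > >     ThreadMXBean bean = ManagementFactory.getThreadMXBean();
> > >> > > > > > > > > > > > > > > >     if (bean.isThreadCpuTimeSupported()) {
> > >> > > > > > > > > > > > > > > >         long tid = Thread.currentThread().getId();
> > >> > > > > > > > > > > > > > > >         long before = bean.getThreadCpuTime(tid); // nanos, -1 if disabled
> > >> > > > > > > > > > > > > > > >         // ... handle the request ...
> > >> > > > > > > > > > > > > > > >         long cpuNanos = bean.getThreadCpuTime(tid) - before;
> > >> > > > > > > > > > > > > > > >         // Aggregate cpuNanos per user/clientId and compare against
> > >> > > > > > > > > > > > > > > >         // quotaRatio * numCores * windowNanos to decide on throttling.
> > >> > > > > > > > > > > > > > > >     }
> > >> > > > > > > > > > > > > > > >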
> > >> > > > > > > > > > > > > > > > -Jay
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com> wrote:
> > >> > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > If the purpose of the KIP is only to protect the cluster from
> > >> > > > > > > > > > > > > > > > > being overwhelmed by crazy clients, and is not intended to
> > >> > > > > > > > > > > > > > > > > address the resource allocation problem among the clients, I am
> > >> > > > > > > > > > > > > > > > > wondering if using a request handling time quota (CPU time quota)
> > >> > > > > > > > > > > > > > > > > is the better option. Here are the reasons:
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > 1. A request handling time quota gives better protection. Say we
> > >> > > > > > > > > > > > > > > > > have a request rate quota and set it to some value like 100
> > >> > > > > > > > > > > > > > > > > requests/sec; it is possible that some of the requests are very
> > >> > > > > > > > > > > > > > > > > expensive and actually take a lot of time to handle. In that case
> > >> > > > > > > > > > > > > > > > > a few clients may still occupy a lot of CPU time even though the
> > >> > > > > > > > > > > > > > > > > request rate is low. Arguably we can carefully set a request rate
> > >> > > > > > > > > > > > > > > > > quota for each request type and client id combination, but it
> > >> > > > > > > > > > > > > > > > > could still be tricky to get it right for everyone.
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > If we use the request handling time quota, we can simply say no
> > >> > > > > > > > > > > > > > > > > client can take more than 30% of the total request handling
> > >> > > > > > > > > > > > > > > > > capacity (measured by time), regardless of the differences among
> > >> > > > > > > > > > > > > > > > > requests or what the client is doing. In this case maybe we can
> > >> > > > > > > > > > > > > > > > > quota all the requests if we want to.
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > 2. The main benefit of using a request rate limit is that it
> > >> > > > > > > > > > > > > > > > > seems more intuitive. It is true that it is probably easier to
> > >> > > > > > > > > > > > > > > > > explain to the user what it means. However, in practice the
> > >> > > > > > > > > > > > > > > > > impact of a request rate quota is not more quantifiable than that
> > >> > > > > > > > > > > > > > > > > of a request handling time quota. Unlike the byte rate quota, it
> > >> > > > > > > > > > > > > > > > > is still difficult to give a number for the impact on throughput
> > >> > > > > > > > > > > > > > > > > or latency when a request rate quota is hit. So it is not better
> > >> > > > > > > > > > > > > > > > > than the request handling time quota. In fact I feel it is
> > >> > > > > > > > > > > > > > > > > clearer to tell a user "you are limited because you have taken
> > >> > > > > > > > > > > > > > > > > 30% of the CPU time on the broker" than something like "your
> > >> > > > > > > > > > > > > > > > > request rate quota on metadata requests has been reached".
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > Thanks,
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io> wrote:
> > >> > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > > I think this proposal makes a lot of sense (especially now that
> > >> > > > > > > > > > > > > > > > > > it is oriented around request rate) and fills the biggest
> > >> > > > > > > > > > > > > > > > > > remaining gap in the multi-tenancy story.
> > >> > > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > > I think for intra-cluster communication (StopReplica, etc) we
> > >> > > > > > > > > > > > > > > > > > could avoid throttling entirely. You can secure or otherwise
> > >> > > > > > > > > > > > > > > > > > lock down the cluster communication to prevent any unauthorized
> > >> > > > > > > > > > > > > > > > > > external party from trying to initiate these requests. As a
> > >> > > > > > > > > > > > > > > > > > result we are as likely to cause problems as solve them by
> > >> > > > > > > > > > > > > > > > > > throttling these, right?
> > >> > > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > > I'm not so sure that we should exempt the consumer requests
> > >> > > > > > > > > > > > > > > > > > such as heartbeat. It's true that if we throttle an app's
> > >> > > > > > > > > > > > > > > > > > heartbeat requests it may cause it to fall out of its consumer
> > >> > > > > > > > > > > > > > > > > > group. However if we don't throttle it, it may DDOS the cluster
> > >> > > > > > > > > > > > > > > > > > if the heartbeat interval is set incorrectly or if some client
> > >> > > > > > > > > > > > > > > > > > in some language has a bug. I think the policy with this kind
> > >> > > > > > > > > > > > > > > > > > of throttling is to protect the cluster above any individual
> > >> > > > > > > > > > > > > > > > > > app, right? I think in general this should be okay since for
> > >> > > > > > > > > > > > > > > > > > most deployments this setting is meant as more of a safety
> > >> > > > > > > > > > > > > > > > > > valve---that is, rather than set something very close to what
> > >> > > > > > > > > > > > > > > > > > you expect to need (say 2 req/sec or whatever) you would have
> > >> > > > > > > > > > > > > > > > > > something quite high (like 100 req/sec) with this meant to
> > >> > > > > > > > > > > > > > > > > > prevent a client gone crazy. I think when used this way,
> > >> > > > > > > > > > > > > > > > > > allowing those to be throttled would actually provide
> > >> > > > > > > > > > > > > > > > > > meaningful protection.
> > >> > > > > > > > > > > > > > > > > >
> > >> > > > > > > > > > > > > > > > > > -Jay
> > >> > > > > > > > > > > > > > > > > >

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by radai <ra...@gmail.com>.
@jun: i wasn't concerned about tying up a request processing thread, but
IIUC the code does still read the entire request out, which might add up to
a non-negligible amount of memory.

On Thu, Feb 23, 2017 at 11:55 AM, Dong Lin <li...@gmail.com> wrote:

> Hey Rajini,
>
> The current KIP says that the maximum delay will be reduced to the window
> size if it is larger than the window size. I have a concern with this:
>
> 1) This essentially means that the user is allowed to exceed their quota
> over a long period of time. Can you provide an upper bound on this
> deviation?
>
> 2) What is the motivation for capping the maximum delay at the window
> size? I am wondering if there is a better alternative to address the
> problem.
>
> 3) It means that the existing metric-related config will have a more
> direct impact on the mechanism of this io-thread-unit-based quota. This
> may be an important change depending on the answer to 1) above. We
> probably need to document this more explicitly.
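>
> To make the concern concrete, a sketch of the capping behavior (the
> formula is a simplification, not the exact broker code):
>
>     // If a client sustains rate R against quota Q over a sampling window
>     // W, an uncapped delay would be roughly (R - Q) / Q * W. Capping the
>     // delay at W means overshoot beyond a point is never fully paid back.
>     long throttleTimeMs(double rate, double quota, long windowMs) {
>         long delay = (long) ((rate - quota) / quota * windowMs);
>         return Math.min(delay, windowMs); // the cap in question
>     }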
>
> Dong
>
>
> On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com> wrote:
>
> > Hey Jun,
> >
> > Yeah you are right. I thought it wasn't, because at LinkedIn it would be
> > too much pressure on inGraph to expose those per-clientId metrics, so we
> > ended up printing them periodically to a local log. Never mind if it is
> > not a general problem.
> >
> > Hey Rajini,
> >
> > - I agree with Jay that we probably don't want to add a new field for
> > every quota in ProduceResponse or FetchResponse. Is there any use case
> > for having separate throttle-time fields for the byte-rate quota and the
> > io-thread-unit quota? You probably need to document this as an interface
> > change if you plan to add a new field to any request.
> >
> > - I don't think IOThread belongs in quotaType. The existing quota types
> > (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the
> > type of request that is throttled, not the quota mechanism that is
> > applied.
> >
> > - If a request is throttled due to this io-thread-unit-based quota, is
> > the existing queue-size metric in ClientQuotaManager incremented?
> >
> > - In the interest of providing a guideline for admins to decide the
> > io-thread-unit-based quota, and for users to understand its impact on
> > their traffic, would it be useful to have a metric that shows the overall
> > byte rate per io-thread-unit? Can we also show this as a per-clientId
> > metric?
> >
> > Thanks,
> > Dong
> >
> >
> > On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:
> >
> >> Hi, Ismael,
> >>
> >> For #3, typically an admin won't configure more io threads than CPU
> >> cores, but it's possible for an admin to start with fewer io threads
> >> than cores and grow that later on.
> >>
> >> Hi, Dong,
> >>
> >> I think the throttleTime sensor on the broker tells the admin whether a
> >> user/clientId is throttled or not.
> >>
> >> Hi, Radai,
> >>
> >> The reasoning for delaying the throttled requests on the broker instead
> >> of returning an error immediately is that the latter has no way to
> >> prevent the client from retrying immediately, which will make things
> >> worse. The delaying logic is based off a delay queue. A separate
> >> expiration thread just waits on the next request to expire. So, it
> >> doesn't tie up a request handler thread.
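> >>
> >> A minimal sketch of that pattern with java.util.concurrent.DelayQueue
> >> (illustrative only, not the broker's actual purgatory code):
> >>
> >>     import java.util.concurrent.DelayQueue;
> >>     import java.util.concurrent.Delayed;
> >>     import java.util.concurrent.TimeUnit;
> >>
> >>     // A throttled response sits in the queue until its delay expires.
> >>     class ThrottledResponse implements Delayed {
> >>         private final long releaseAtMs;
> >>         private final Runnable send;
> >>
> >>         ThrottledResponse(long delayMs, Runnable send) {
> >>             this.releaseAtMs = System.currentTimeMillis() + delayMs;
> >>             this.send = send;
> >>         }
> >>         public long getDelay(TimeUnit unit) {
> >>             return unit.convert(releaseAtMs - System.currentTimeMillis(),
> >>                                 TimeUnit.MILLISECONDS);
> >>         }
> >>         public int compareTo(Delayed o) {
> >>             return Long.compare(getDelay(TimeUnit.MILLISECONDS),
> >>                                 o.getDelay(TimeUnit.MILLISECONDS));
> >>         }
> >>         void complete() { send.run(); }
> >>     }
> >>
> >>     // One daemon thread blocks on take(); request handler threads only
> >>     // enqueue and immediately move on to the next request.
> >>     class ThrottleExpirer {
> >>         final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();
> >>         void start() {
> >>             Thread t = new Thread(() -> {
> >>                 while (true) {
> >>                     try { queue.take().complete(); }
> >>                     catch (InterruptedException e) { return; }
> >>                 }
> >>             }, "throttled-response-expirer");
> >>             t.setDaemon(true);
> >>             t.start();
> >>         }
> >>     }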
> >>
> >> Thanks,
> >>
> >> Jun
> >>
> >> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk> wrote:
> >>
> >> > Hi Jay,
> >> >
> >> > Regarding 1, I definitely like the simplicity of keeping a single
> >> > throttle time field in the response. The downside is that the client
> >> > metrics will be more coarse-grained.
> >> >
> >> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> >> > `log.cleaner.min.cleanable.ratio`.
> >> >
> >> > Ismael
> >> >
> >> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io> wrote:
> >> >
> >> > > A few minor comments:
> >> > >
> >> > >    1. Isn't it the case that the throttling time response field
> >> > >    should have the total time your request was throttled,
> >> > >    irrespective of the quotas that caused it? Limiting it to the byte
> >> > >    rate quota doesn't make sense, but I also don't think we want to
> >> > >    end up adding new fields in the response for every single thing we
> >> > >    quota, right?
> >> > >    2. I don't think we should make this quota specifically about io
> >> > >    threads. Once we introduce these quotas people set them and expect
> >> > >    them to be enforced (and if they aren't it may cause an outage).
> >> > >    As a result they are a bit more sensitive than normal configs, I
> >> > >    think. The current thread pools seem like something of an
> >> > >    implementation detail and not the level the user-facing quotas
> >> > >    should be involved with. I think it might be better to make this a
> >> > >    general request-time throttle with no mention of I/O threads in
> >> > >    the naming, and simply acknowledge the current limitation (which
> >> > >    we may someday fix) in the docs: that this covers only the time
> >> > >    after the request is read off the network.
> >> > >    3. As such I think the right interface to the user would be
> >> > >    something like percent_request_time in {0,...,100}, or
> >> > >    request_time_ratio in {0.0,...,1.0} (I think "ratio" is the
> >> > >    terminology we used in the other metrics if the scale is between 0
> >> > >    and 1, right?)
> >> > >
> >> > > -Jay
> >> > >
> >> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> >> > >
> >> > > > Guozhang/Dong,
> >> > > >
> >> > > > Thank you for the feedback.
> >> > > >
> >> > > > Guozhang: I have updated the section on co-existence of byte rate
> >> > > > and request time quotas.
> >> > > >
> >> > > > Dong: I hadn't added much detail on the metrics and sensors since
> >> > > > they are going to be very similar to the existing metrics and
> >> > > > sensors. To avoid confusion, I have now added more detail. All
> >> > > > metrics are in the group "quotaType" and all sensors have names
> >> > > > starting with "quotaType" (where quotaType is
> >> > > > Produce/Fetch/LeaderReplication/FollowerReplication/IOThread). So
> >> > > > there will be no reuse of existing metrics/sensors. The new ones for
> >> > > > request processing time based throttling will be completely
> >> > > > independent of existing metrics/sensors, but will be consistent in
> >> > > > format.
> >> > > >
> >> > > > The existing throttle_time_ms field in produce/fetch responses will
> >> > > > not be impacted by this KIP. That will continue to return byte-rate
> >> > > > based throttling times. In addition, a new field
> >> > > > request_throttle_time_ms will be added to return request quota based
> >> > > > throttling times. These will be exposed as new metrics on the
> >> > > > client side.
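> >> > > >
> >> > > > For illustration, the responses would then carry two independent
> >> > > > fields along these lines (a sketch; the exact schema is in the KIP):
> >> > > >
> >> > > >     ProduceResponse/FetchResponse =>
> >> > > >       throttle_time_ms          // existing: byte-rate quota delay
> >> > > >       request_throttle_time_ms  // new: request time quota delay
> >> > > >       ...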
> >> > > >
> >> > > > Since all metrics and sensors are different for each type of quota,
> >> > > > I believe there are already sufficient metrics to monitor throttling
> >> > > > on both the client and broker side for each type of throttling.
> >> > > >
> >> > > > Regards,
> >> > > >
> >> > > > Rajini
> >> > > >
> >> > > >
> >> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <li...@gmail.com> wrote:
> >> > > >
> >> > > > > Hey Rajini,
> >> > > > >
> >> > > > > I think it makes a lot of sense to use io_thread_units as the
> >> > > > > metric to quota users' traffic here. LGTM overall. I have some
> >> > > > > questions regarding sensors.
> >> > > > >
> >> > > > > - Can you be more specific in the KIP about what sensors will be
> >> > > > > added? For example, it will be useful to specify the name and
> >> > > > > attributes of these new sensors.
> >> > > > >
> >> > > > > - We currently have throttle-time and queue-size for the byte-rate
> >> > > > > based quota. Are you going to have a separate throttle-time and
> >> > > > > queue-size for requests throttled by the io_thread_unit-based
> >> > > > > quota, or will they share the same sensor?
> >> > > > >
> >> > > > > - Does the throttle-time in the ProduceResponse and FetchResponse
> >> > > > > contain time due to the io_thread_unit-based quota?
> >> > > > >
> >> > > > > - Currently the Kafka server doesn't provide any log or metric
> >> > > > > that tells whether any given clientId (or user) is throttled. This
> >> > > > > is not too bad because we can still check the client-side
> >> > > > > byte-rate metric to validate whether a given client is throttled.
> >> > > > > But with this io_thread_unit there will be no way to validate
> >> > > > > whether a given client is slow because it has exceeded its
> >> > > > > io_thread_unit limit. It is necessary for users to be able to know
> >> > > > > this information to figure out whether they have reached their
> >> > > > > quota limit. How about we add a log4j log on the server side to
> >> > > > > periodically print the (client_id, byte-rate-throttle-time,
> >> > > > > io-thread-unit-throttle-time) so that the Kafka administrator can
> >> > > > > find the users that have reached their limit and act accordingly?
> >> > > > >
> >> > > > > Thanks,
> >> > > > > Dong
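> >> > > > >
> >> > > > > Something along these lines (a sketch; the quota manager method
> >> > > > > names are made up):
> >> > > > >
> >> > > > >     // Periodically log per-client throttle times so admins can
> >> > > > >     // spot clients hitting their quotas (slf4j-style logging).
> >> > > > >     scheduler.scheduleAtFixedRate(() -> {
> >> > > > >         for (ClientQuotaStats s : quotaManager.snapshot()) {
> >> > > > >             log.info("client_id={} byte_rate_throttle_ms={} "
> >> > > > >                     + "io_thread_unit_throttle_ms={}",
> >> > > > >                     s.clientId(), s.byteRateThrottleMs(),
> >> > > > >                     s.ioThreadUnitThrottleMs());
> >> > > > >         }
> >> > > > >     }, 1, 1, TimeUnit.MINUTES);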
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > >
> >> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> >> > > > >
> >> > > > > > Made a pass over the doc, overall LGTM except a minor comment on
> >> > > > > > the throttling implementation:
> >> > > > > >
> >> > > > > > It is stated that "Request processing time throttling will be
> >> > > > > > applied on top if necessary." I thought that meant the request
> >> > > > > > processing time throttling is applied first, but reading on I
> >> > > > > > found it actually meant that produce / fetch byte rate throttling
> >> > > > > > is applied first.
> >> > > > > >
> >> > > > > > Also the last sentence, "The remaining delay if any is applied to
> >> > > > > > the response.", is a bit confusing to me. Maybe reword it a bit?
> >> > > > > >
> >> > > > > >
> >> > > > > > Guozhang
> >> > > > > >
> >> > > > > >
> >> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io> wrote:
> >> > > > > >
> >> > > > > > > Hi, Rajini,
> >> > > > > > >
> >> > > > > > > Thanks for the updated KIP. The latest proposal looks good to me.
> >> > > > > > >
> >> > > > > > > Jun
> >> > > > > > >
> >> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> >> > > > > > >
> >> > > > > > > > Jun/Roger,
> >> > > > > > > >
> >> > > > > > > > Thank you for the feedback.
> >> > > > > > > >
> >> > > > > > > > 1. I have updated the KIP to use absolute units instead of a
> >> > > > > > > > percentage. The property is called *io_thread_units* to align
> >> > > > > > > > with the thread count property *num.io.threads*. When we
> >> > > > > > > > implement network thread utilization quotas, we can add
> >> > > > > > > > another property *network_thread_units*.
> >> > > > > > > >
> >> > > > > > > > 2. ControlledShutdown is already listed under the exempt
> >> > > > > > > > requests. Jun, did you mean a different request that needs to
> >> > > > > > > > be added? The four requests currently exempt in the KIP are
> >> > > > > > > > StopReplica, ControlledShutdown, LeaderAndIsr and
> >> > > > > > > > UpdateMetadata. These are controlled using the ClusterAction
> >> > > > > > > > ACL, so it is easy to exclude them and only throttle if
> >> > > > > > > > unauthorized. I wasn't sure if there are other requests used
> >> > > > > > > > only for inter-broker communication that needed to be
> >> > > > > > > > excluded.
> >> > > > > > > >
> >> > > > > > > > 3. I was thinking the smallest change would be to replace all
> >> > > > > > > > references to *requestChannel.sendResponse()* with a local
> >> > > > > > > > method *sendResponseMaybeThrottle()* that does the throttling,
> >> > > > > > > > if any, plus the send. If we throttle first in
> >> > > > > > > > *KafkaApis.handle()*, the time spent within the method
> >> > > > > > > > handling the request will not be recorded or used in
> >> > > > > > > > throttling. We can look into this again when the PR is ready
> >> > > > > > > > for review.
> >> > > > > > > >
> >> > > > > > > > Regards,
> >> > > > > > > >
> >> > > > > > > > Rajini
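> >> > > > > > > >
> >> > > > > > > > Roughly this shape (a sketch of the idea only; the names below
> >> > > > > > > > are illustrative, not the final code):
> >> > > > > > > >
> >> > > > > > > >     // Assumed interface for the quota bookkeeping.
> >> > > > > > > >     interface Quotas {
> >> > > > > > > >         long recordAndGetThrottleMs(String clientId, long nanos);
> >> > > > > > > >         void delayResponse(Runnable send, long throttleMs);
> >> > > > > > > >     }
> >> > > > > > > >
> >> > > > > > > >     void sendResponseMaybeThrottle(Quotas quotas, String clientId,
> >> > > > > > > >                                    long requestTimeNanos,
> >> > > > > > > >                                    Runnable send) {
> >> > > > > > > >         long ms = quotas.recordAndGetThrottleMs(clientId,
> >> > > > > > > >                                                 requestTimeNanos);
> >> > > > > > > >         if (ms > 0)
> >> > > > > > > >             quotas.delayResponse(send, ms); // park in purgatory
> >> > > > > > > >         else
> >> > > > > > > >             send.run();                     // send immediately
> >> > > > > > > >     }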
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > >
> >> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <roger.hoover@gmail.com> wrote:
> >> > > > > > > >
> >> > > > > > > > > Great to see this KIP and the excellent discussion.
> >> > > > > > > > >
> >> > > > > > > > > To me, Jun's suggestion makes sense. If my application is
> >> > > > > > > > > allocated 1 request handler unit, then it's as if I have a
> >> > > > > > > > > Kafka broker with a single request handler thread dedicated
> >> > > > > > > > > to me. That's the most I can use, at least. That allocation
> >> > > > > > > > > doesn't change even if an admin later increases the size of
> >> > > > > > > > > the request thread pool on the broker. It's similar to the
> >> > > > > > > > > CPU abstraction that VMs and containers get from hypervisors
> >> > > > > > > > > or OS schedulers. While different client access patterns can
> >> > > > > > > > > use wildly different amounts of request thread resources per
> >> > > > > > > > > request, a given application will generally have a stable
> >> > > > > > > > > access pattern and can figure out empirically how many
> >> > > > > > > > > "request thread units" it needs to meet its
> >> > > > > > > > > throughput/latency goals.
> >> > > > > > > > >
> >> > > > > > > > > Cheers,
> >> > > > > > > > >
> >> > > > > > > > > Roger
> >> > > > > > > > >
> >> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io> wrote:
> >> > > > > > > > >
> >> > > > > > > > > > Hi, Rajini,
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks for the updated KIP. A few more comments.
> >> > > > > > > > > >
> >> > > > > > > > > > 1. A concern with request_time_percent is that it's not an
> >> > > > > > > > > > absolute value. Let's say you give a user a 10% limit. If
> >> > > > > > > > > > the admin doubles the number of request handler threads,
> >> > > > > > > > > > that user now actually has twice the absolute capacity.
> >> > > > > > > > > > This may confuse people a bit. So, perhaps setting the
> >> > > > > > > > > > quota based on an absolute request thread unit is better.
> >> > > > > > > > > >
> >> > > > > > > > > > 2. ControlledShutdownRequest is also an inter-broker
> >> > > > > > > > > > request and needs to be excluded from throttling.
> >> > > > > > > > > >
> >> > > > > > > > > > 3. Implementation-wise, I am wondering if it's simpler to
> >> > > > > > > > > > apply the request time throttling first in
> >> > > > > > > > > > KafkaApis.handle(). Otherwise, we will need to add the
> >> > > > > > > > > > throttling logic in each type of request.
> >> > > > > > > > > >
> >> > > > > > > > > > Thanks,
> >> > > > > > > > > >
> >> > > > > > > > > > Jun
> >> > > > > > > > > >
> >> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> >> > > > > > > > > >
> >> > > > > > > > > > > Jun,
> >> > > > > > > > > > >
> >> > > > > > > > > > > Thank you for the review.
> >> > > > > > > > > > >
> >> > > > > > > > > > > I have reverted to the original KIP that throttles based
> >> > > > > > > > > > > on request handler utilization. At the moment, it uses a
> >> > > > > > > > > > > percentage, but I am happy to change to a fraction (out
> >> > > > > > > > > > > of 1 instead of 100) if required. I have added the
> >> > > > > > > > > > > examples from this discussion to the KIP. Also added a
> >> > > > > > > > > > > "Future Work" section to address network thread
> >> > > > > > > > > > > utilization. The configuration is named
> >> > > > > > > > > > > "request_time_percent" with the expectation that it can
> >> > > > > > > > > > > also be used as the limit for network thread utilization
> >> > > > > > > > > > > when that is implemented, so that users have to set only
> >> > > > > > > > > > > one config for the two and not have to worry about the
> >> > > > > > > > > > > internal distribution of the work between the two thread
> >> > > > > > > > > > > pools in Kafka.
> >> > > > > > > > > > >
> >> > > > > > > > > > > Regards,
> >> > > > > > > > > > >
> >> > > > > > > > > > > Rajini
> >> > > > > > > > > > >
> >> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <jun@confluent.io> wrote:
> >> > > > > > > > > > >
> >> > > > > > > > > > > > Hi, Rajini,
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Thanks for the proposal.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > The benefit of using the request processing time over
> >> > > > > > > > > > > > the request rate is exactly what people have said. I
> >> > > > > > > > > > > > will just expand on that a bit. Consider the following
> >> > > > > > > > > > > > case. The producer sends a produce request with a 10MB
> >> > > > > > > > > > > > message but compressed to 100KB with gzip. The
> >> > > > > > > > > > > > decompression of the message on the broker could take
> >> > > > > > > > > > > > 10-15 seconds, during which time a request handler
> >> > > > > > > > > > > > thread is completely blocked. In this case, neither
> >> > > > > > > > > > > > the byte-in quota nor the request rate quota may be
> >> > > > > > > > > > > > effective in protecting the broker. Consider another
> >> > > > > > > > > > > > case. A consumer group starts with 10 instances and
> >> > > > > > > > > > > > later on switches to 20 instances. The request rate
> >> > > > > > > > > > > > will likely double, but the actual load on the broker
> >> > > > > > > > > > > > may not double since each fetch request only contains
> >> > > > > > > > > > > > half of the partitions. A request rate quota may not
> >> > > > > > > > > > > > be easy to configure in this case.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > What we really want is to be able to prevent a client
> >> > > > > > > > > > > > from using too much of the server side resources. In
> >> > > > > > > > > > > > this particular KIP, this resource is the capacity of
> >> > > > > > > > > > > > the request handler threads. I agree that it may not
> >> > > > > > > > > > > > be intuitive for the users to determine how to set the
> >> > > > > > > > > > > > right limit. However, this is not completely new and
> >> > > > > > > > > > > > has been done in the container world already. For
> >> > > > > > > > > > > > example, Linux cgroup
> >> > > > > > > > > > > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> >> > > > > > > > > > > > has the concept of cpu.cfs_quota_us, which specifies
> >> > > > > > > > > > > > the total amount of time in microseconds for which all
> >> > > > > > > > > > > > tasks in a cgroup can run during a one second period.
> >> > > > > > > > > > > > We can potentially model the request handler threads
> >> > > > > > > > > > > > in a similar way. For example, each request handler
> >> > > > > > > > > > > > thread can be 1 request handler unit and the admin can
> >> > > > > > > > > > > > configure a limit on how many units (say 0.01) a
> >> > > > > > > > > > > > client can have.
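> >> > > > > > > > > > > >
> >> > > > > > > > > > > > As a sketch of what that accounting could look like
> >> > > > > > > > > > > > (illustrative only):
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >     // 1 unit == the full capacity of one request
> >> > > > > > > > > > > >     // handler thread. A client capped at 0.01 units
> >> > > > > > > > > > > >     // may use at most 1% of one thread's time per
> >> > > > > > > > > > > >     // quota window, regardless of the pool size.
> >> > > > > > > > > > > >     boolean exceedsQuota(long clientBusyNanos,
> >> > > > > > > > > > > >                          long windowNanos,
> >> > > > > > > > > > > >                          double quotaUnits) {
> >> > > > > > > > > > > >         double used = (double) clientBusyNanos / windowNanos;
> >> > > > > > > > > > > >         return used > quotaUnits;
> >> > > > > > > > > > > >     }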
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Regarding not throttling the internal broker to broker
> >> > > > > > > > > > > > requests: we could do that. Alternatively, we could
> >> > > > > > > > > > > > just let the admin configure a high limit for the
> >> > > > > > > > > > > > kafka user (it may not be able to do that easily based
> >> > > > > > > > > > > > on clientId though).
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Ideally we want to be able to protect the utilization
> >> > > > > > > > > > > > of the network thread pool too. The difficulty is
> >> > > > > > > > > > > > mostly what Rajini said: (1) The mechanism for
> >> > > > > > > > > > > > throttling the requests is through Purgatory, and we
> >> > > > > > > > > > > > will have to think through how to integrate that into
> >> > > > > > > > > > > > the network layer. (2) In the network layer, currently
> >> > > > > > > > > > > > we know the user, but not the clientId of the request.
> >> > > > > > > > > > > > So, it's a bit tricky to throttle based on clientId
> >> > > > > > > > > > > > there. Plus, the byteOut quota can already protect the
> >> > > > > > > > > > > > network thread utilization for fetch requests. So, if
> >> > > > > > > > > > > > we can't figure out this part right now, just focusing
> >> > > > > > > > > > > > on the request handling threads for this KIP is still
> >> > > > > > > > > > > > a useful feature.
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > Jun
> >> > > > > > > > > > > >
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> >> > > > > > > > > > > >
> >> > > > > > > > > > > > > Thank you all for the feedback.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Jay: I have removed the exemption for consumer
> >> > > > > > > > > > > > > heartbeat etc. Agree that protecting the cluster
> >> > > > > > > > > > > > > is more important than protecting individual
> >> > > > > > > > > > > > > apps. Have retained the exemption for
> >> > > > > > > > > > > > > StopReplica/LeaderAndIsr etc; these are throttled
> >> > > > > > > > > > > > > only if authorization fails (so they can't be
> >> > > > > > > > > > > > > used for DoS attacks in a secure cluster, but
> >> > > > > > > > > > > > > inter-broker requests complete without delays).
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > I will wait another day to see if there is any
> >> > > > > > > > > > > > > objection to quotas based on request processing
> >> > > > > > > > > > > > > time (as opposed to request rate) and if there
> >> > > > > > > > > > > > > are no objections, I will revert to the original
> >> > > > > > > > > > > > > proposal with some changes.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > The original proposal was only including the time
> >> > > > > > > > > > > > > used by the request handler threads (that made
> >> > > > > > > > > > > > > calculation easy). I think the suggestion is to
> >> > > > > > > > > > > > > include the time spent in the network threads as
> >> > > > > > > > > > > > > well since that may be significant. As Jay
> >> > > > > > > > > > > > > pointed out, it is more complicated to calculate
> >> > > > > > > > > > > > > the total available CPU time and convert to a
> >> > > > > > > > > > > > > ratio when there are *m* I/O threads and *n*
> >> > > > > > > > > > > > > network threads.
> >> > > > > > > > > > > > > ThreadMXBean#getThreadCPUTime() may give us what
> >> > > > > > > > > > > > > we want, but it can be very expensive on some
> >> > > > > > > > > > > > > platforms. As Becket and Guozhang have pointed
> >> > > > > > > > > > > > > out, we do have several time measurements already
> >> > > > > > > > > > > > > for generating metrics that we could use, though
> >> > > > > > > > > > > > > we might want to switch to nanoTime() instead of
> >> > > > > > > > > > > > > currentTimeMillis() since some of the values for
> >> > > > > > > > > > > > > small requests may be < 1ms. But rather than add
> >> > > > > > > > > > > > > up the time spent in the I/O thread and the
> >> > > > > > > > > > > > > network thread, wouldn't it be better to convert
> >> > > > > > > > > > > > > the time spent on each thread into a separate
> >> > > > > > > > > > > > > ratio? UserA has a request quota of 5%. Can we
> >> > > > > > > > > > > > > take that to mean that UserA can use 5% of the
> >> > > > > > > > > > > > > time on network threads and 5% of the time on I/O
> >> > > > > > > > > > > > > threads? If either is exceeded, the response is
> >> > > > > > > > > > > > > throttled - it would mean maintaining two sets of
> >> > > > > > > > > > > > > metrics for the two durations, but would result
> >> > > > > > > > > > > > > in more meaningful ratios. We could define two
> >> > > > > > > > > > > > > quota limits (UserA has 5% of request threads and
> >> > > > > > > > > > > > > 10% of network threads), but that seems
> >> > > > > > > > > > > > > unnecessary and harder to explain to users.
> >> > > > > > > > > > > > >
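(To make the two-ratio idea above concrete, here is a minimal, hypothetical
sketch -- not code from the KIP -- of how a per-user busy-time ratio could be
tracked per thread pool with nanoTime() and checked against one 5% quota;
window reset/rolling logic is omitted for brevity:)

    // Hypothetical sketch: one busy-time counter per thread pool, both
    // compared against the same quota ratio (e.g. 0.05 for 5%).
    import java.util.concurrent.TimeUnit;
    import java.util.concurrent.atomic.AtomicLong;

    class UserThreadTimeQuota {
        private final double quotaRatio;
        private final int ioThreads;        // m I/O threads
        private final int networkThreads;   // n network threads
        private final long windowNanos = TimeUnit.SECONDS.toNanos(1);
        private final AtomicLong ioBusyNanos = new AtomicLong();
        private final AtomicLong networkBusyNanos = new AtomicLong();

        UserThreadTimeQuota(double quotaRatio, int ioThreads, int networkThreads) {
            this.quotaRatio = quotaRatio;
            this.ioThreads = ioThreads;
            this.networkThreads = networkThreads;
        }

        void recordIoTime(long nanos)      { ioBusyNanos.addAndGet(nanos); }
        void recordNetworkTime(long nanos) { networkBusyNanos.addAndGet(nanos); }

        // Throttle if the user exceeded its share on either pool; each
        // pool's capacity per window is (window length) x (thread count).
        boolean violated() {
            double ioRatio  = (double) ioBusyNanos.get() / (windowNanos * ioThreads);
            double netRatio = (double) networkBusyNanos.get() / (windowNanos * networkThreads);
            return ioRatio > quotaRatio || netRatio > quotaRatio;
        }
    }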
> >> > > > > > > > > > > > > Back to why and how quotas are applied to
> >> > > > > > > > > > > > > network thread utilization:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > a) In the case of fetch, the time spent in the
> >> > > > > > > > > > > > > network thread may be significant and I can see
> >> > > > > > > > > > > > > the need to include this. Are there other
> >> > > > > > > > > > > > > requests where the network thread utilization is
> >> > > > > > > > > > > > > significant? In the case of fetch, request
> >> > > > > > > > > > > > > handler thread utilization would throttle
> >> > > > > > > > > > > > > clients with high request rate and low data
> >> > > > > > > > > > > > > volume, and the fetch byte rate quota will
> >> > > > > > > > > > > > > throttle clients with high data volume. Network
> >> > > > > > > > > > > > > thread utilization is perhaps proportional to
> >> > > > > > > > > > > > > the data volume. I am wondering if we even need
> >> > > > > > > > > > > > > to throttle based on network thread utilization
> >> > > > > > > > > > > > > or whether the data volume quota covers this
> >> > > > > > > > > > > > > case.
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > b) At the moment, we record and check for quota
> >> > > > > > > > > > > > > violation at the same time. If a quota is
> >> > > > > > > > > > > > > violated, the response is delayed. Using Jay's
> >> > > > > > > > > > > > > example of disk reads for fetches happening in
> >> > > > > > > > > > > > > the network thread, we can't record and delay a
> >> > > > > > > > > > > > > response after the disk reads. We could record
> >> > > > > > > > > > > > > the time spent on the network thread when the
> >> > > > > > > > > > > > > response is complete and introduce a delay for
> >> > > > > > > > > > > > > handling a subsequent request (separating out
> >> > > > > > > > > > > > > recording and quota violation handling in the
> >> > > > > > > > > > > > > case of network thread overload). Does that make
> >> > > > > > > > > > > > > sense?
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Regards,
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > Rajini
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin
> >> > > > > > > > > > > > > <becket.qin@gmail.com> wrote:
> >> > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Hey Jay,
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time is a
> >> > > > > > > > > > > > > > little tricky. I am thinking that maybe we can
> >> > > > > > > > > > > > > > use the existing request statistics. They are
> >> > > > > > > > > > > > > > already very detailed, so we can probably see
> >> > > > > > > > > > > > > > the approximate CPU time from them, e.g.
> >> > > > > > > > > > > > > > something like (total_time -
> >> > > > > > > > > > > > > > request/response_queue_time - remote_time).
> >> > > > > > > > > > > > > >
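(As a rough illustration of that formula, with made-up numbers: a request
whose total_time is 120 ms, with 15 ms spent in the request/response queues
and 80 ms of remote/purgatory time, would be charged approximately
120 - 15 - 80 = 25 ms of handler thread time.)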
> >> > > > > > > > > > > > > > I agree with Guozhang that when a user is
> >> > > > > > > > > > > > > > throttled it is likely that we need to see if
> >> > > > > > > > > > > > > > anything has gone wrong first, and if the
> >> > > > > > > > > > > > > > users are well behaved and just need more
> >> > > > > > > > > > > > > > resources, we will have to bump up the quota
> >> > > > > > > > > > > > > > for them. It is true that pre-allocating CPU
> >> > > > > > > > > > > > > > time quota precisely for the users is
> >> > > > > > > > > > > > > > difficult. So in practice it would probably be
> >> > > > > > > > > > > > > > more like first setting a relatively high
> >> > > > > > > > > > > > > > protective CPU time quota for everyone and
> >> > > > > > > > > > > > > > increasing that for some individual clients on
> >> > > > > > > > > > > > > > demand.
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang
> >> > > > > > > > > > > > > > <wangguoz@gmail.com> wrote:
> >> > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > This is a great proposal, glad to see it
> >> > > > > > > > > > > > > > > happening.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or more
> >> > > > > > > > > > > > > > > specifically the processing time ratio,
> >> > > > > > > > > > > > > > > instead of the request rate throttling as
> >> > > > > > > > > > > > > > > well. Becket has very well summed up my
> >> > > > > > > > > > > > > > > rationales above, and one thing to add here
> >> > > > > > > > > > > > > > > is that the former has good support both for
> >> > > > > > > > > > > > > > > "protecting against rogue clients" and for
> >> > > > > > > > > > > > > > > "utilizing a cluster for multi-tenancy
> >> > > > > > > > > > > > > > > usage": when thinking about how to explain
> >> > > > > > > > > > > > > > > this to the end users, I find it actually
> >> > > > > > > > > > > > > > > more natural than the request rate since, as
> >> > > > > > > > > > > > > > > mentioned above, different requests will
> >> > > > > > > > > > > > > > > have quite different "cost", and Kafka today
> >> > > > > > > > > > > > > > > already has various request types (produce,
> >> > > > > > > > > > > > > > > fetch, admin, metadata, etc); because of
> >> > > > > > > > > > > > > > > that, request rate throttling may not be as
> >> > > > > > > > > > > > > > > effective unless it is set very
> >> > > > > > > > > > > > > > > conservatively.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > Regarding user reactions when they are
> >> > > > > > > > > > > > > > > throttled, I think it may differ
> >> > > > > > > > > > > > > > > case-by-case, and needs to be discovered /
> >> > > > > > > > > > > > > > > guided by looking at related metrics. So in
> >> > > > > > > > > > > > > > > other words users would not expect to get
> >> > > > > > > > > > > > > > > additional information by simply being told
> >> > > > > > > > > > > > > > > "hey, you are throttled", which is all that
> >> > > > > > > > > > > > > > > throttling does; they need to take a
> >> > > > > > > > > > > > > > > follow-up step and see "hmm, I'm throttled
> >> > > > > > > > > > > > > > > probably because of ..", which is done by
> >> > > > > > > > > > > > > > > looking at other metric values: e.g. whether
> >> > > > > > > > > > > > > > > I'm bombarding the brokers with metadata
> >> > > > > > > > > > > > > > > requests, which are usually cheap to handle
> >> > > > > > > > > > > > > > > but I'm sending thousands per second; or is
> >> > > > > > > > > > > > > > > it because I'm catching up and hence sending
> >> > > > > > > > > > > > > > > very heavy fetch requests with large
> >> > > > > > > > > > > > > > > min.bytes, etc.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > Regarding the implementation, as once
> >> > > > > > > > > > > > > > > discussed with Jun, this seems not very
> >> > > > > > > > > > > > > > > difficult since today we are already
> >> > > > > > > > > > > > > > > collecting the "thread pool utilization"
> >> > > > > > > > > > > > > > > metric, which is a single percentage
> >> > > > > > > > > > > > > > > "aggregateIdleMeter" value; we are already
> >> > > > > > > > > > > > > > > effectively aggregating it for each request
> >> > > > > > > > > > > > > > > in KafkaRequestHandler, and we can just
> >> > > > > > > > > > > > > > > extend it by recording the source client id
> >> > > > > > > > > > > > > > > when handling the requests and aggregating
> >> > > > > > > > > > > > > > > by clientId as well as in the total
> >> > > > > > > > > > > > > > > aggregate.
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > Guozhang
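(A minimal, hypothetical sketch of the per-clientId aggregation described
above -- illustrative only, not actual Kafka code:)

    // Hypothetical sketch: record handler busy time per clientId in
    // addition to the pool-wide aggregate.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicLong;

    class HandlerTimeRecorder {
        private final AtomicLong totalBusyNanos = new AtomicLong();
        private final Map<String, AtomicLong> busyNanosByClient =
            new ConcurrentHashMap<>();

        // Called from the request handler loop around each request:
        //   long t0 = System.nanoTime();
        //   handle(request);
        //   recorder.record(request.clientId(), System.nanoTime() - t0);
        void record(String clientId, long busyNanos) {
            totalBusyNanos.addAndGet(busyNanos);
            busyNanosByClient
                .computeIfAbsent(clientId, id -> new AtomicLong())
                .addAndGet(busyNanos);
        }
    }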
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps
> >> > > > > > > > > > > > > > > <jay@confluent.io> wrote:
> >> > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > Hey Becket/Rajini,
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > When I thought about it more deeply I came
> >> > > > > > > > > > > > > > > > around to the "percent of processing time"
> >> > > > > > > > > > > > > > > > metric too. It seems a lot closer to the
> >> > > > > > > > > > > > > > > > thing we actually care about and need to
> >> > > > > > > > > > > > > > > > protect. I also think this would be a very
> >> > > > > > > > > > > > > > > > useful metric even in the absence of
> >> > > > > > > > > > > > > > > > throttling, just to debug who's using
> >> > > > > > > > > > > > > > > > capacity.
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > Two problems to consider:
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > >    1. I agree that for the user it is
> >> > > > > > > > > > > > > > > >    understandable what led to their being
> >> > > > > > > > > > > > > > > >    throttled, but it is a bit hard to
> >> > > > > > > > > > > > > > > >    figure out the safe range for them.
> >> > > > > > > > > > > > > > > >    i.e. if I have a new app that will send
> >> > > > > > > > > > > > > > > >    200 messages/sec I can probably reason
> >> > > > > > > > > > > > > > > >    that I'll be under the throttling limit
> >> > > > > > > > > > > > > > > >    of 300 req/sec. However if I need to be
> >> > > > > > > > > > > > > > > >    under a 10% CPU resources limit it may
> >> > > > > > > > > > > > > > > >    be a bit harder for me to know a priori
> >> > > > > > > > > > > > > > > >    if I will or won't.
> >> > > > > > > > > > > > > > > >    2. Calculating the available CPU time
> >> > > > > > > > > > > > > > > >    is a bit difficult since there are
> >> > > > > > > > > > > > > > > >    actually two thread pools--the I/O
> >> > > > > > > > > > > > > > > >    threads and the network threads. I
> >> > > > > > > > > > > > > > > >    think it might be workable to count
> >> > > > > > > > > > > > > > > >    just the I/O thread time as in the
> >> > > > > > > > > > > > > > > >    proposal, but the network thread work
> >> > > > > > > > > > > > > > > >    is actually non-trivial (e.g. all the
> >> > > > > > > > > > > > > > > >    disk reads for fetches happen in that
> >> > > > > > > > > > > > > > > >    thread). If you count both the network
> >> > > > > > > > > > > > > > > >    and I/O threads it can skew things a
> >> > > > > > > > > > > > > > > >    bit. E.g. say you have 50 network
> >> > > > > > > > > > > > > > > >    threads, 10 I/O threads, and 8 cores,
> >> > > > > > > > > > > > > > > >    what is the available cpu time in a
> >> > > > > > > > > > > > > > > >    second? I suppose this is a problem
> >> > > > > > > > > > > > > > > >    whenever you have a bottleneck between
> >> > > > > > > > > > > > > > > >    I/O and network threads or if you end
> >> > > > > > > > > > > > > > > >    up significantly over-provisioning one
> >> > > > > > > > > > > > > > > >    pool (both of which are hard to avoid).
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > An alternative for CPU throttling would be
> >> > > > > > > > > > > > > > > > to use this api:
> >> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > That would let you track actual CPU usage
> >> > > > > > > > > > > > > > > > across the network, I/O, and purgatory
> >> > > > > > > > > > > > > > > > threads and look at it as a percentage of
> >> > > > > > > > > > > > > > > > total cores. I think this fixes many
> >> > > > > > > > > > > > > > > > problems in the reliability of the metric.
> >> > > > > > > > > > > > > > > > Its meaning is slightly different, as it
> >> > > > > > > > > > > > > > > > is just CPU (you don't get charged for
> >> > > > > > > > > > > > > > > > time blocking on I/O), but that may be
> >> > > > > > > > > > > > > > > > okay because we already have a throttle on
> >> > > > > > > > > > > > > > > > I/O. The downside is I think it is
> >> > > > > > > > > > > > > > > > possible this api can be disabled or isn't
> >> > > > > > > > > > > > > > > > always available, and it may also be
> >> > > > > > > > > > > > > > > > expensive (also I've never used it so not
> >> > > > > > > > > > > > > > > > sure if it really works the way I think).
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > -Jay
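(The getThreadCpuTime() approach sketched above could look roughly like the
following hypothetical illustration, which also reflects the availability
caveats Jay mentions:)

    // Hypothetical sketch: sample per-thread CPU time and express it as
    // a fraction of total core capacity over a measurement window.
    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    class CpuTimeSampler {
        private final ThreadMXBean bean = ManagementFactory.getThreadMXBean();

        // Sum of CPU nanoseconds for the given threads, or -1 if the JVM
        // or platform does not support/enable CPU time measurement.
        long totalCpuNanos(long[] threadIds) {
            if (!bean.isThreadCpuTimeSupported() || !bean.isThreadCpuTimeEnabled())
                return -1;
            long sum = 0;
            for (long id : threadIds) {
                long t = bean.getThreadCpuTime(id); // -1 if thread not alive
                if (t > 0) sum += t;
            }
            return sum;
        }

        // Fraction of total CPU capacity used during the window.
        double utilization(long cpuNanos, long windowNanos) {
            int cores = Runtime.getRuntime().availableProcessors();
            return (double) cpuNanos / ((double) windowNanos * cores);
        }
    }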
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin
> >> > > > > > > > > > > > > > > > <becket.qin@gmail.com> wrote:
> >> > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > If the purpose of the KIP is only to
> >> > > > > > > > > > > > > > > > > protect the cluster from being
> >> > > > > > > > > > > > > > > > > overwhelmed by crazy clients and is not
> >> > > > > > > > > > > > > > > > > intended to address the resource
> >> > > > > > > > > > > > > > > > > allocation problem among the clients, I
> >> > > > > > > > > > > > > > > > > am wondering if using a request handling
> >> > > > > > > > > > > > > > > > > time quota (CPU time quota) is a better
> >> > > > > > > > > > > > > > > > > option. Here are the reasons:
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > 1. A request handling time quota has
> >> > > > > > > > > > > > > > > > > better protection. Say we have a request
> >> > > > > > > > > > > > > > > > > rate quota and set that to some value
> >> > > > > > > > > > > > > > > > > like 100 requests/sec; it is possible
> >> > > > > > > > > > > > > > > > > that some of the requests are very
> >> > > > > > > > > > > > > > > > > expensive and actually take a lot of
> >> > > > > > > > > > > > > > > > > time to handle. In that case a few
> >> > > > > > > > > > > > > > > > > clients may still occupy a lot of CPU
> >> > > > > > > > > > > > > > > > > time even though the request rate is
> >> > > > > > > > > > > > > > > > > low. Arguably we can carefully set the
> >> > > > > > > > > > > > > > > > > request rate quota for each request and
> >> > > > > > > > > > > > > > > > > client id combination, but it could
> >> > > > > > > > > > > > > > > > > still be tricky to get it right for
> >> > > > > > > > > > > > > > > > > everyone.
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > If we use the request handling time
> >> > > > > > > > > > > > > > > > > quota, we can simply say no client can
> >> > > > > > > > > > > > > > > > > take up more than 30% of the total
> >> > > > > > > > > > > > > > > > > request handling capacity (measured by
> >> > > > > > > > > > > > > > > > > time), regardless of the difference
> >> > > > > > > > > > > > > > > > > among different requests or what the
> >> > > > > > > > > > > > > > > > > client is doing. In this case maybe we
> >> > > > > > > > > > > > > > > > > can quota all the requests if we want
> >> > > > > > > > > > > > > > > > > to.
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > 2. The main benefit of using a request
> >> > > > > > > > > > > > > > > > > rate limit is that it seems more
> >> > > > > > > > > > > > > > > > > intuitive. It is true that it is
> >> > > > > > > > > > > > > > > > > probably easier to explain to the user
> >> > > > > > > > > > > > > > > > > what it means. However, in practice it
> >> > > > > > > > > > > > > > > > > looks like the impact of a request rate
> >> > > > > > > > > > > > > > > > > quota is not more quantifiable than that
> >> > > > > > > > > > > > > > > > > of the request handling time quota.
> >> > > > > > > > > > > > > > > > > Unlike the byte rate quota, it is still
> >> > > > > > > > > > > > > > > > > difficult to give a number about the
> >> > > > > > > > > > > > > > > > > impact on throughput or latency when a
> >> > > > > > > > > > > > > > > > > request rate quota is hit. So it is not
> >> > > > > > > > > > > > > > > > > better than the request handling time
> >> > > > > > > > > > > > > > > > > quota. In fact I feel it is clearer to
> >> > > > > > > > > > > > > > > > > tell a user "you are limited because you
> >> > > > > > > > > > > > > > > > > have taken 30% of the CPU time on the
> >> > > > > > > > > > > > > > > > > broker" than something like "your
> >> > > > > > > > > > > > > > > > > request rate quota on metadata requests
> >> > > > > > > > > > > > > > > > > has been reached".
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > Thanks,
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps
> >> > > > > > > > > > > > > > > > > <jay@confluent.io> wrote:
> >> > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > > I think this proposal makes a lot of
> >> > > > > > > > > > > > > > > > > > sense (especially now that it is
> >> > > > > > > > > > > > > > > > > > oriented around request rate) and
> >> > > > > > > > > > > > > > > > > > fills the biggest remaining gap in the
> >> > > > > > > > > > > > > > > > > > multi-tenancy story.
> >> > > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > > I think for intra-cluster
> >> > > > > > > > > > > > > > > > > > communication (StopReplica, etc) we
> >> > > > > > > > > > > > > > > > > > could avoid throttling entirely. You
> >> > > > > > > > > > > > > > > > > > can secure or otherwise lock down the
> >> > > > > > > > > > > > > > > > > > cluster communication to avoid any
> >> > > > > > > > > > > > > > > > > > unauthorized external party from
> >> > > > > > > > > > > > > > > > > > trying to initiate these requests. As
> >> > > > > > > > > > > > > > > > > > a result we are as likely to cause
> >> > > > > > > > > > > > > > > > > > problems as solve them by throttling
> >> > > > > > > > > > > > > > > > > > these, right?
> >> > > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > > I'm not so sure that we should exempt
> >> > > > > > > > > > > > > > > > > > the consumer requests such as
> >> > > > > > > > > > > > > > > > > > heartbeat. It's true that if we
> >> > > > > > > > > > > > > > > > > > throttle an app's heartbeat requests
> >> > > > > > > > > > > > > > > > > > it may cause it to fall out of its
> >> > > > > > > > > > > > > > > > > > consumer group. However if we don't
> >> > > > > > > > > > > > > > > > > > throttle it, it may DDOS the cluster
> >> > > > > > > > > > > > > > > > > > if the heartbeat interval is set
> >> > > > > > > > > > > > > > > > > > incorrectly or if some client in some
> >> > > > > > > > > > > > > > > > > > language has a bug. I think the policy
> >> > > > > > > > > > > > > > > > > > with this kind of throttling is to
> >> > > > > > > > > > > > > > > > > > protect the cluster above any
> >> > > > > > > > > > > > > > > > > > individual app, right? I think in
> >> > > > > > > > > > > > > > > > > > general this should be okay since for
> >> > > > > > > > > > > > > > > > > > most deployments this setting is meant
> >> > > > > > > > > > > > > > > > > > as more of a safety valve---that is,
> >> > > > > > > > > > > > > > > > > > rather than set something very close
> >> > > > > > > > > > > > > > > > > > to what you expect to need (say 2
> >> > > > > > > > > > > > > > > > > > req/sec or whatever) you would have
> >> > > > > > > > > > > > > > > > > > something quite high (like 100
> >> > > > > > > > > > > > > > > > > > req/sec) with this meant to prevent a
> >> > > > > > > > > > > > > > > > > > client gone crazy. I think when used
> >> > > > > > > > > > > > > > > > > > this way allowing those to be
> >> > > > > > > > > > > > > > > > > > throttled would actually provide
> >> > > > > > > > > > > > > > > > > > meaningful protection.
> >> > > > > > > > > > > > > > > > > >
> >> > > > > > > > > > > > > > > > > > -Jay

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Dong Lin <li...@gmail.com>.
Hey Rajini,

The current KIP says that the maximum delay will be reduced to the window
size if it is larger than the window size. I have a concern with this:

1) This essentially means that the user is allowed to exceed their quota
over a long period of time. Can you provide an upper bound on this
deviation?

2) What is the motivation for capping the maximum delay at the window size?
I am wondering if there is a better alternative to address the problem.

3) It means that the existing metric-related config will have a more direct
impact on the mechanism of this io-thread-unit-based quota. This may be an
important change depending on the answer to 1) above. We probably need to
document this more explicitly.

Dong


On Thu, Feb 23, 2017 at 10:56 AM, Dong Lin <li...@gmail.com> wrote:

> Hey Jun,
>
> Yeah you are right. I thought it wasn't, because at LinkedIn it would be
> too much pressure on inGraph to expose those per-clientId metrics, so we
> ended up printing them periodically to a local log. Never mind if it is
> not a general problem.
>
> Hey Rajini,
>
> - I agree with Jay that we probably don't want to add a new field for
> every quota in ProduceResponse or FetchResponse. Is there any use-case for
> having separate throttle-time fields for the byte-rate quota and the
> io-thread-unit quota? You probably need to document this as an interface
> change if you plan to add a new field in any request.
>
> - I don't think IOThread belongs to quotaType. The existing quota types
> (i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the
> types of requests that are throttled, not the quota mechanism that is
> applied.
>
> - If a request is throttled due to this io-thread-unit-based quota, is the
> existing queue-size metric in ClientQuotaManager incremented?
>
> - In the interest of providing a guideline for admins to decide on an
> io-thread-unit-based quota and for users to understand its impact on their
> traffic, would it be useful to have a metric that shows the overall
> byte-rate per io-thread-unit? Can we also show this as a per-clientId
> metric?
>
> Thanks,
> Dong
>
>
> On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:
>
>> Hi, Ismael,
>>
>> For #3, typically, an admin won't configure more io threads than CPU
>> cores,
>> but it's possible for an admin to start with fewer io threads than cores
>> and grow that later on.
>>
>> Hi, Dong,
>>
>> I think the throttleTime sensor on the broker tells the admin whether a
>> user/clientId is throttled or not.
>>
>> Hi, radai,
>>
>> The reasoning for delaying the throttled requests on the broker instead of
>> returning an error immediately is that the latter has no way to prevent
>> the
>> client from retrying immediately, which will make things worse. The
>> delaying logic is based off a delay queue. A separate expiration thread
>> just waits on the next to be expired request. So, it doesn't tie up a
>> request handler thread.
>>
>> Thanks,
>>
>> Jun
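(The delay-queue mechanism Jun describes could be sketched as follows --
hypothetical names, not the actual broker code:)

    // Hypothetical sketch: throttled responses wait in a DelayQueue; one
    // expiration thread sends each response when its delay elapses, so
    // no request handler thread is tied up while waiting.
    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    class ThrottledResponse implements Delayed {
        final Runnable send;
        final long sendAtNanos;

        ThrottledResponse(Runnable send, long delayMs) {
            this.send = send;
            this.sendAtNanos = System.nanoTime() + TimeUnit.MILLISECONDS.toNanos(delayMs);
        }

        @Override public long getDelay(TimeUnit unit) {
            return unit.convert(sendAtNanos - System.nanoTime(), TimeUnit.NANOSECONDS);
        }

        @Override public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                                other.getDelay(TimeUnit.NANOSECONDS));
        }
    }

    class ThrottleExpirationLoop implements Runnable {
        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void throttle(Runnable send, long delayMs) {
            queue.put(new ThrottledResponse(send, delayMs));
        }

        @Override public void run() {
            try {
                while (true) queue.take().send.run(); // blocks until next expiry
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        }
    }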
>>
>> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk> wrote:
>>
>> > Hi Jay,
>> >
>> > Regarding 1, I definitely like the simplicity of keeping a single
>> > throttle time field in the response. The downside is that the client
>> > metrics will be more coarse grained.
>> >
>> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
>> > `log.cleaner.min.cleanable.ratio`.
>> >
>> > Ismael
>> >
>> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io> wrote:
>> >
>> > > A few minor comments:
>> > >
>> > >    1. Isn't it the case that the throttling time response field
>> > >    should have the total time your request was throttled,
>> > >    irrespective of the quotas that caused that? Limiting it to the
>> > >    byte rate quota doesn't make sense, but also I don't think we
>> > >    want to end up adding new fields in the response for every
>> > >    single thing we quota, right?
>> > >    2. I don't think we should make this quota specifically about io
>> > >    threads. Once we introduce these quotas people set them and
>> > >    expect them to be enforced (and if they aren't it may cause an
>> > >    outage). As a result they are a bit more sensitive than normal
>> > >    configs, I think. The current thread pools seem like something
>> > >    of an implementation detail and not the level the user-facing
>> > >    quotas should be involved with. I think it might be better to
>> > >    make this a general request-time throttle with no mention in the
>> > >    naming about I/O threads and simply acknowledge the current
>> > >    limitation (which we may someday fix) in the docs that this
>> > >    covers only the time after the request is read off the network.
>> > >    3. As such I think the right interface to the user would be
>> > >    something like percent_request_time and be in {0,...100} or
>> > >    request_time_ratio and be in {0.0,...,1.0} (I think "ratio" is
>> > >    the terminology we used if the scale is between 0 and 1 in the
>> > >    other metrics, right?)
>> > >
>> > > -Jay
>> > >
>> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram
>> > > <rajinisivaram@gmail.com> wrote:
>> > >
>> > > > Guozhang/Dong,
>> > > >
>> > > > Thank you for the feedback.
>> > > >
>> > > > Guozhang: I have updated the section on co-existence of byte rate
>> > > > and request time quotas.
>> > > >
>> > > > Dong: I hadn't added much detail to the metrics and sensors since
>> they
>> > > are
>> > > > going to be very similar to the existing metrics and sensors. To
>> avoid
>> > > > confusion, I have now added more detail. All metrics are in the
>> group
>> > > > "quotaType" and all sensors have names starting with "quotaType"
>> (where
>> > > > quotaType is Produce/Fetch/LeaderReplication/
>> > > > FollowerReplication/*IOThread*).
>> > > > So there will be no reuse of existing metrics/sensors. The new ones
>> for
>> > > > request processing time based throttling will be completely
>> independent
>> > > of
>> > > > existing metrics/sensors, but will be consistent in format.
>> > > >
>> > > > The existing throttle_time_ms field in produce/fetch responses will
>> not
>> > > be
>> > > > impacted by this KIP. That will continue to return byte-rate based
>> > > > throttling times. In addition, a new field request_throttle_time_ms
>> > will
>> > > be
>> > > > added to return request quota based throttling times. These will be
>> > > exposed
>> > > > as new metrics on the client-side.
>> > > >
>> > > > Since all metrics and sensors are different for each type of quota,
>> I
>> > > > believe there is already sufficient metrics to monitor throttling on
>> > both
>> > > > client and broker side for each type of throttling.
>> > > >
>> > > > Regards,
>> > > >
>> > > > Rajini
>> > > >
>> > > >
>> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <li...@gmail.com> wrote:
>> > > >
>> > > > > Hey Rajini,
>> > > > >
>> > > > > I think it makes a lot of sense to use io_thread_units as a
>> > > > > metric to quota users' traffic here. LGTM overall. I have some
>> > > > > questions regarding sensors.
>> > > > >
>> > > > > - Can you be more specific in the KIP what sensors will be
>> > > > > added? For example, it will be useful to specify the name and
>> > > > > attributes of these new sensors.
>> > > > >
>> > > > > - We currently have throttle-time and queue-size for byte-rate
>> > > > > based quota. Are you going to have separate throttle-time and
>> > > > > queue-size for requests throttled by io_thread_unit-based quota,
>> > > > > or will they share the same sensor?
>> > > > >
>> > > > > - Does the throttle-time in the ProduceResponse and
>> > > > > FetchResponse contain time due to the io_thread_unit-based
>> > > > > quota?
>> > > > >
>> > > > > - Currently the Kafka server does not provide any log or metrics
>> > > > > that tell whether any given clientId (or user) is throttled.
>> > > > > This is not too bad because we can still check the client-side
>> > > > > byte-rate metric to validate whether a given client is
>> > > > > throttled. But with this io_thread_unit, there will be no way to
>> > > > > validate whether a given client is slow because it has exceeded
>> > > > > its io_thread_unit limit. It is necessary for users to be able
>> > > > > to know this information to figure out whether they have reached
>> > > > > their quota limit. How about we add a log4j log on the server
>> > > > > side to periodically print the (client_id,
>> > > > > byte-rate-throttle-time, io-thread-unit-throttle-time) so that
>> > > > > the Kafka administrator can find those users that have reached
>> > > > > their limit and act accordingly?
>> > > > >
>> > > > > Thanks,
>> > > > > Dong
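(A minimal, hypothetical sketch of the periodic log Dong suggests -- the
names are made up for illustration; a real version would use log4j rather
than stdout:)

    // Hypothetical sketch: periodically log per-clientId throttle times
    // so an administrator can see who is hitting which quota.
    import java.util.Map;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    class ThrottleReporter {
        private final ScheduledExecutorService scheduler =
            Executors.newSingleThreadScheduledExecutor();

        // throttleMs: clientId -> [byteRateThrottleMs, ioThreadThrottleMs]
        void start(Map<String, long[]> throttleMs) {
            scheduler.scheduleAtFixedRate(() -> {
                throttleMs.forEach((clientId, t) -> {
                    if (t[0] > 0 || t[1] > 0)
                        System.out.printf(
                            "client_id=%s byte-rate-throttle-time=%d io-thread-unit-throttle-time=%d%n",
                            clientId, t[0], t[1]);
                });
            }, 1, 1, TimeUnit.MINUTES);
        }
    }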
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > >
>> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <
>> wangguoz@gmail.com>
>> > > > wrote:
>> > > > >
>> > > > > > Made a pass over the doc, overall LGTM except a minor comment on
>> > the
>> > > > > > throttling implementation:
>> > > > > >
>> > > > > > Stated as "Request processing time throttling will be applied
>> > > > > > on top if necessary.", I thought that it meant the request
>> > > > > > processing time throttling is applied first, but reading on I
>> > > > > > found it actually meant to apply produce / fetch byte rate
>> > > > > > throttling first.
>> > > > > >
>> > > > > > Also the last sentence "The remaining delay if any is applied
>> > > > > > to the response." is a bit confusing to me. Maybe reword it a
>> > > > > > bit?
>> > > > > >
>> > > > > >
>> > > > > > Guozhang
>> > > > > >
>> > > > > >
>> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io>
>> wrote:
>> > > > > >
>> > > > > > > Hi, Rajini,
>> > > > > > >
>> > > > > > > Thanks for the updated KIP. The latest proposal looks good to
>> me.
>> > > > > > >
>> > > > > > > Jun
>> > > > > > >
>> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram
>> > > > > > > <rajinisivaram@gmail.com> wrote:
>> > > > > > >
>> > > > > > > > Jun/Roger,
>> > > > > > > >
>> > > > > > > > Thank you for the feedback.
>> > > > > > > >
>> > > > > > > > 1. I have updated the KIP to use absolute units instead of
>> > > > > > > > percentage. The property is called *io_thread_units* to
>> > > > > > > > align with the thread count property *num.io.threads*.
>> > > > > > > > When we implement network thread utilization quotas, we
>> > > > > > > > can add another property *network_thread_units*.
>> > > > > > > >
>> > > > > > > > 2. ControlledShutdown is already listed under the exempt
>> > > > > > > > requests. Jun, did you mean a different request that needs
>> > > > > > > > to be added? The four requests currently exempt in the KIP
>> > > > > > > > are StopReplica, ControlledShutdown, LeaderAndIsr and
>> > > > > > > > UpdateMetadata. These are controlled using the
>> > > > > > > > ClusterAction ACL, so it is easy to exclude them and only
>> > > > > > > > throttle if unauthorized. I wasn't sure if there are other
>> > > > > > > > requests used only for inter-broker communication that
>> > > > > > > > needed to be excluded.
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > 3. I was thinking the smallest change would be to replace
>> > > > > > > > all references to *requestChannel.sendResponse()* with a
>> > > > > > > > local method *sendResponseMaybeThrottle()* that does the
>> > > > > > > > throttling, if any, plus sends the response. If we
>> > > > > > > > throttle first in *KafkaApis.handle()*, the time spent
>> > > > > > > > within the method handling the request will not be
>> > > > > > > > recorded or used in throttling. We can look into this
>> > > > > > > > again when the PR is ready for review.
>> > > > > > > >
>> > > > > > > > Regards,
>> > > > > > > >
>> > > > > > > > Rajini
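(A hypothetical sketch of the sendResponseMaybeThrottle() wrapper described
in point 3 above -- illustrative only, the interface names are made up:)

    // Hypothetical sketch: measure handler time for the request, record
    // it against the client's quota, and delay the response if needed.
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;

    class QuotaAwareSender {
        interface Quota {
            // Records the observed time; returns the delay in ms (0 = none).
            long recordAndGetDelayMs(String clientId, long handlerNanos);
        }

        private final Quota quota;
        private final ScheduledExecutorService delayer =
            Executors.newSingleThreadScheduledExecutor();

        QuotaAwareSender(Quota quota) { this.quota = quota; }

        void sendResponseMaybeThrottle(String clientId, long handlerStartNanos,
                                       Runnable sendResponse) {
            long handlerNanos = System.nanoTime() - handlerStartNanos;
            long delayMs = quota.recordAndGetDelayMs(clientId, handlerNanos);
            if (delayMs <= 0)
                sendResponse.run();                  // under quota: send now
            else                                     // over quota: send later
                delayer.schedule(sendResponse, delayMs, TimeUnit.MILLISECONDS);
        }
    }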
>> > > > > > > >
>> > > > > > > >
>> > > > > > > >
>> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover
>> > > > > > > > <roger.hoover@gmail.com> wrote:
>> > > > > > > >
>> > > > > > > > > Great to see this KIP and the excellent discussion.
>> > > > > > > > >
>> > > > > > > > > To me, Jun's suggestion makes sense. If my application
>> > > > > > > > > is allocated 1 request handler unit, then it's as if I
>> > > > > > > > > have a Kafka broker with a single request handler thread
>> > > > > > > > > dedicated to me. That's the most I can use, at least.
>> > > > > > > > > That allocation doesn't change even if an admin later
>> > > > > > > > > increases the size of the request thread pool on the
>> > > > > > > > > broker. It's similar to the CPU abstraction that VMs and
>> > > > > > > > > containers get from hypervisors or OS schedulers. While
>> > > > > > > > > different client access patterns can use wildly
>> > > > > > > > > different amounts of request thread resources per
>> > > > > > > > > request, a given application will generally have a
>> > > > > > > > > stable access pattern and can figure out empirically how
>> > > > > > > > > many "request thread units" it needs to meet its
>> > > > > > > > > throughput/latency goals.
>> > > > > > > > >
>> > > > > > > > > Cheers,
>> > > > > > > > >
>> > > > > > > > > Roger
>> > > > > > > > >
>> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao
>> > > > > > > > > <jun@confluent.io> wrote:
>> > > > > > > > >
>> > > > > > > > > > Hi, Rajini,
>> > > > > > > > > >
>> > > > > > > > > > Thanks for the updated KIP. A few more comments.
>> > > > > > > > > >
>> > > > > > > > > > 1. A concern with request_time_percent is that it's
>> > > > > > > > > > not an absolute value. Let's say you give a user a 10%
>> > > > > > > > > > limit. If the admin doubles the number of request
>> > > > > > > > > > handler threads, that user now actually has twice the
>> > > > > > > > > > absolute capacity. This may confuse people a bit. So,
>> > > > > > > > > > perhaps setting the quota based on an absolute request
>> > > > > > > > > > thread unit is better (a worked example follows this
>> > > > > > > > > > mail).
>> > > > > > > > > >
>> > > > > > > > > > 2. ControlledShutdownRequest is also an inter-broker
>> > > > > > > > > > request and needs to be excluded from throttling.
>> > > > > > > > > >
>> > > > > > > > > > 3. Implementation-wise, I am wondering if it's simpler
>> > > > > > > > > > to apply the request time throttling first in
>> > > > > > > > > > KafkaApis.handle(). Otherwise, we will need to add the
>> > > > > > > > > > throttling logic in each type of request.
>> > > > > > > > > >
>> > > > > > > > > > Thanks,
>> > > > > > > > > >
>> > > > > > > > > > Jun
>> > > > > > > > > >
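(Worked numbers for Jun's point 1 above, with made-up values: a broker with
8 request handler threads has 8 thread-seconds of handler time available per
second, so a 10% quota allows 0.8 thread-seconds per second; if the admin
doubles the pool to 16 threads, the same 10% silently grows to 1.6
thread-seconds per second, whereas an absolute quota of 0.8 units would stay
fixed.)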
>> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram
>> > > > > > > > > > <rajinisivaram@gmail.com> wrote:
>> > > > > > > > > >
>> > > > > > > > > > > Jun,
>> > > > > > > > > > >
>> > > > > > > > > > > Thank you for the review.
>> > > > > > > > > > >
>> > > > > > > > > > > I have reverted to the original KIP that throttles
>> > > > > > > > > > > based on request handler utilization. At the moment,
>> > > > > > > > > > > it uses a percentage, but I am happy to change to a
>> > > > > > > > > > > fraction (out of 1 instead of 100) if required. I
>> > > > > > > > > > > have added the examples from this discussion to the
>> > > > > > > > > > > KIP. Also added a "Future Work" section to address
>> > > > > > > > > > > network thread utilization. The configuration is
>> > > > > > > > > > > named "request_time_percent" with the expectation
>> > > > > > > > > > > that it can also be used as the limit for network
>> > > > > > > > > > > thread utilization when that is implemented, so that
>> > > > > > > > > > > users have to set only one config for the two and
>> > > > > > > > > > > not have to worry about the internal distribution of
>> > > > > > > > > > > the work between the two thread pools in Kafka.
>> > > > > > > > > > >
>> > > > > > > > > > > Regards,
>> > > > > > > > > > >
>> > > > > > > > > > > Rajini
>> > > > > > > > > > >
>> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao
>> > > > > > > > > > > <jun@confluent.io> wrote:
>> > > > > > > > > > >
>> > > > > > > > > > > > Hi, Rajini,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks for the proposal.
>> > > > > > > > > > > >
>> > > > > > > > > > > > The benefit of using the request processing time
>> > > > > > > > > > > > over the request rate is exactly what people have
>> > > > > > > > > > > > said. I will just expand on that a bit. Consider
>> > > > > > > > > > > > the following case. The producer sends a produce
>> > > > > > > > > > > > request with a 10MB message but compressed to
>> > > > > > > > > > > > 100KB with gzip. The decompression of the message
>> > > > > > > > > > > > on the broker could take 10-15 seconds, during
>> > > > > > > > > > > > which time a request handler thread is completely
>> > > > > > > > > > > > blocked. In this case, neither the byte-in quota
>> > > > > > > > > > > > nor the request rate quota may be effective in
>> > > > > > > > > > > > protecting the broker. Consider another case. A
>> > > > > > > > > > > > consumer group starts with 10 instances and later
>> > > > > > > > > > > > on switches to 20 instances. The request rate will
>> > > > > > > > > > > > likely double, but the actual load on the broker
>> > > > > > > > > > > > may not double since each fetch request only
>> > > > > > > > > > > > contains half of the partitions. A request rate
>> > > > > > > > > > > > quota may not be easy to configure in this case.
>> > > > > > > > > > > >
>> > > > > > > > > > > > What we really want is to be able to prevent a
>> > > > > > > > > > > > client from using too much of the server side
>> > > > > > > > > > > > resources. In this particular KIP, this resource
>> > > > > > > > > > > > is the capacity of the request handler threads. I
>> > > > > > > > > > > > agree that it may not be intuitive for the users
>> > > > > > > > > > > > to determine how to set the right limit. However,
>> > > > > > > > > > > > this is not completely new and has been done in
>> > > > > > > > > > > > the container world already. For example, Linux
>> > > > > > > > > > > > cgroup (https://access.redhat.com/documentation/
>> > > > > > > > > > > > en-US/Red_Hat_Enterprise_Linux/6/html/
>> > > > > > > > > > > > Resource_Management_Guide/sec-cpu.html) has the
>> > > > > > > > > > > > concept of cpu.cfs_quota_us, which specifies the
>> > > > > > > > > > > > total amount of time in microseconds for which all
>> > > > > > > > > > > > tasks in a cgroup can run during a one second
>> > > > > > > > > > > > period. We can potentially model the request
>> > > > > > > > > > > > handler threads in a similar way. For example,
>> > > > > > > > > > > > each request handler thread can be 1 request
>> > > > > > > > > > > > handler unit and the admin can configure a limit
>> > > > > > > > > > > > on how many units (say 0.01) a client can have.
>> > > > > > > > > > > >
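(By analogy with cpu.cfs_quota_us, a quota of 0.01 request handler units
would entitle a client to 0.01 thread-seconds of handler time per second --
i.e. 10 ms every second -- independent of how many handler threads the
broker runs.)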
>> > > > > > > > > > > > Regarding not throttling the internal broker to
>> > > > > > > > > > > > broker requests: we could do that. Alternatively,
>> > > > > > > > > > > > we could just let the admin configure a high limit
>> > > > > > > > > > > > for the kafka user (it may not be able to do that
>> > > > > > > > > > > > easily based on clientId though).
>> > > > > > > > > > > >
>> > > > > > > > > > > > Ideally we want to be able to protect the
>> > > > > > > > > > > > utilization of the network thread pool too. The
>> > > > > > > > > > > > difficulty is mostly what Rajini said: (1) The
>> > > > > > > > > > > > mechanism for throttling the requests is through
>> > > > > > > > > > > > Purgatory and we will have to think through how to
>> > > > > > > > > > > > integrate that into the network layer. (2) In the
>> > > > > > > > > > > > network layer, currently we know the user, but not
>> > > > > > > > > > > > the clientId of the request. So, it's a bit tricky
>> > > > > > > > > > > > to throttle based on clientId there. Plus, the
>> > > > > > > > > > > > byteOut quota can already protect the network
>> > > > > > > > > > > > thread utilization for fetch requests. So, if we
>> > > > > > > > > > > > can't figure out this part right now, just
>> > > > > > > > > > > > focusing on the request handling threads for this
>> > > > > > > > > > > > KIP is still a useful feature.
>> > > > > > > > > > > >
>> > > > > > > > > > > > Thanks,
>> > > > > > > > > > > >
>> > > > > > > > > > > > Jun
>> > > > > > > > > > > >
>> > > > > > > > > > > >
>> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram
>> > > > > > > > > > > > <rajinisivaram@gmail.com> wrote:
>> > > > > > > > > > > >
>> > > > > > > > > > > > > Thank you all for the feedback.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Jay: I have removed the exemption for consumer
>> > > > > > > > > > > > > heartbeat etc. Agree that protecting the cluster
>> > > > > > > > > > > > > is more important than protecting individual
>> > > > > > > > > > > > > apps. Have retained the exemption for
>> > > > > > > > > > > > > StopReplica/LeaderAndIsr etc; these are throttled
>> > > > > > > > > > > > > only if authorization fails (so they can't be
>> > > > > > > > > > > > > used for DoS attacks in a secure cluster, but
>> > > > > > > > > > > > > inter-broker requests complete without delays).
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > I will wait another day to see if there is any
>> > > > > > > > > > > > > objection to quotas based on request processing
>> > > > > > > > > > > > > time (as opposed to request rate) and if there
>> > > > > > > > > > > > > are no objections, I will revert to the original
>> > > > > > > > > > > > > proposal with some changes.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > The original proposal was only including the
>> > > > > > > > > > > > > time used by the request handler threads (that
>> > > > > > > > > > > > > made calculation easy). I think the suggestion
>> > > > > > > > > > > > > is to include the time spent in the network
>> > > > > > > > > > > > > threads as well since that may be significant.
>> > > > > > > > > > > > > As Jay pointed out, it is more complicated to
>> > > > > > > > > > > > > calculate the total available CPU time and
>> > > > > > > > > > > > > convert to a ratio when there are *m* I/O
>> > > > > > > > > > > > > threads and *n* network threads.
>> > > > > > > > > > > > > ThreadMXBean#getThreadCPUTime() may give us what
>> > > > > > > > > > > > > we want, but it can be very expensive on some
>> > > > > > > > > > > > > platforms. As Becket and Guozhang have pointed
>> > > > > > > > > > > > > out, we do have several time measurements
>> > > > > > > > > > > > > already for generating metrics that we could
>> > > > > > > > > > > > > use, though we might want to switch to
>> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis() since
>> > > > > > > > > > > > > some of the values for small requests may be
>> > > > > > > > > > > > > < 1ms. But rather than add up the time spent in
>> > > > > > > > > > > > > the I/O thread and the network thread, wouldn't
>> > > > > > > > > > > > > it be better to convert the time spent on each
>> > > > > > > > > > > > > thread into a separate ratio? UserA has a
>> > > > > > > > > > > > > request quota of 5%. Can we take that to mean
>> > > > > > > > > > > > > that UserA can use 5% of the time on network
>> > > > > > > > > > > > > threads and 5% of the time on I/O threads? If
>> > > > > > > > > > > > > either is
>> > > exceeded,
>> > > > > the
>> > > > > > > > > > response
>> > > > > > > > > > > is
>> > > > > > > > > > > > > throttled - it would mean maintaining two sets of
>> > > metrics
>> > > > > for
>> > > > > > > the
>> > > > > > > > > two
>> > > > > > > > > > > > > durations, but would result in more meaningful
>> > ratios.
>> > > We
>> > > > > > could
>> > > > > > > > > > define
>> > > > > > > > > > > > two
>> > > > > > > > > > > > > quota limits (UserA has 5% of request threads and
>> 10%
>> > > of
>> > > > > > > network
>> > > > > > > > > > > > threads),
>> > > > > > > > > > > > > but that seems unnecessary and harder to explain
>> to
>> > > > users.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Back to why and how quotas are applied to network
>> > > thread
>> > > > > > > > > utilization:
>> > > > > > > > > > > > > a) In the case of fetch,  the time spent in the
>> > network
>> > > > > > thread
>> > > > > > > > may
>> > > > > > > > > be
>> > > > > > > > > > > > > significant and I can see the need to include
>> this.
>> > Are
>> > > > > there
>> > > > > > > > other
>> > > > > > > > > > > > > requests where the network thread utilization is
>> > > > > significant?
>> > > > > > > In
>> > > > > > > > > the
>> > > > > > > > > > > case
>> > > > > > > > > > > > > of fetch, request handler thread utilization would
>> > > > throttle
>> > > > > > > > clients
>> > > > > > > > > > > with
>> > > > > > > > > > > > > high request rate, low data volume and fetch byte
>> > rate
>> > > > > quota
>> > > > > > > will
>> > > > > > > > > > > > throttle
>> > > > > > > > > > > > > clients with high data volume. Network thread
>> > > utilization
>> > > > > is
>> > > > > > > > > perhaps
>> > > > > > > > > > > > > proportional to the data volume. I am wondering
>> if we
>> > > > even
>> > > > > > need
>> > > > > > > > to
>> > > > > > > > > > > > throttle
>> > > > > > > > > > > > > based on network thread utilization or whether the
>> > data
>> > > > > > volume
>> > > > > > > > > quota
>> > > > > > > > > > > > covers
>> > > > > > > > > > > > > this case.
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > b) At the moment, we record and check for quota
>> > > violation
>> > > > > at
>> > > > > > > the
>> > > > > > > > > same
>> > > > > > > > > > > > time.
>> > > > > > > > > > > > > If a quota is violated, the response is delayed.
>> > Using
>> > > > > Jay'e
>> > > > > > > > > example
>> > > > > > > > > > of
>> > > > > > > > > > > > > disk reads for fetches happening in the network
>> > thread,
>> > > > We
>> > > > > > > can't
>> > > > > > > > > > record
>> > > > > > > > > > > > and
>> > > > > > > > > > > > > delay a response after the disk reads. We could
>> > record
>> > > > the
>> > > > > > time
>> > > > > > > > > spent
>> > > > > > > > > > > on
>> > > > > > > > > > > > > the network thread when the response is complete
>> and
>> > > > > > introduce
>> > > > > > > a
>> > > > > > > > > > delay
>> > > > > > > > > > > > for
>> > > > > > > > > > > > > handling a subsequent request (separate out
>> recording
>> > > and
>> > > > > > quota
>> > > > > > > > > > > violation
>> > > > > > > > > > > > > handling in the case of network thread overload).
>> > Does
>> > > > that
>> > > > > > > make
>> > > > > > > > > > sense?
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Regards,
>> > > > > > > > > > > > >
>> > > > > > > > > > > > > Rajini
>> > > > > > > > > > > > >
>> > > > > > > > > > > > >
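
A minimal sketch of the "separate ratio per thread pool" idea Rajini
floats above, in Java; class and method names are invented for
illustration. Each pool is normalized by its own capacity, so a single
quota value (say 5%) can be checked against both ratios:

    public class PerPoolUsage {
        private final long windowNanos;
        private final int ioThreads;
        private final int networkThreads;
        private long ioThreadNanos;
        private long networkThreadNanos;

        public PerPoolUsage(long windowNanos, int ioThreads, int networkThreads) {
            this.windowNanos = windowNanos;
            this.ioThreads = ioThreads;
            this.networkThreads = networkThreads;
        }

        public synchronized void recordIo(long nanos) { ioThreadNanos += nanos; }
        public synchronized void recordNetwork(long nanos) { networkThreadNanos += nanos; }

        // Capacity of a pool in one window is (threads * window length), so
        // neither pool's sizing skews the other pool's ratio.
        public synchronized boolean exceeds(double quotaFraction) {
            double ioRatio = (double) ioThreadNanos / (windowNanos * ioThreads);
            double netRatio = (double) networkThreadNanos / (windowNanos * networkThreads);
            return ioRatio > quotaFraction || netRatio > quotaFraction;
        }
    }

Callers would time each stage with System.nanoTime(), which (unlike
currentTimeMillis()) resolves the sub-millisecond requests mentioned
above, e.g. long t = System.nanoTime(); ...; usage.recordIo(System.nanoTime() - t);
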
>> On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <becket.qin@gmail.com>
>> wrote:
>>
>> Hey Jay,
>>
>> Yeah, I agree that enforcing the CPU time is a little tricky. I am
>> thinking that maybe we can use the existing request statistics. They are
>> already very detailed, so we can probably see the approximate CPU time
>> from them, e.g. something like (total_time -
>> request/response_queue_time - remote_time).
>>
>> I agree with Guozhang that when a user is throttled it is likely that we
>> need to see if anything has gone wrong first, and if the users are well
>> behaved and just need more resources, we will have to bump up the quota
>> for them. It is true that pre-allocating CPU time quota precisely for
>> the users is difficult. So in practice it would probably be more like
>> first setting a relatively high protective CPU time quota for everyone
>> and increasing it for some individual clients on demand.
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
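
Becket's approximation, written out as a tiny Java helper. The timing
names mirror the per-request measurements Kafka's request metrics break
out, but treat them as illustrative here:

    public final class RequestStatsCpuEstimate {
        // Time not spent queued or waiting on other brokers is roughly the
        // time a broker thread was actively working on the request.
        static long approxCpuTimeMs(long totalTimeMs,
                                    long requestQueueTimeMs,
                                    long responseQueueTimeMs,
                                    long remoteTimeMs) {
            return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs
                    - remoteTimeMs;
        }
    }
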
>> On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com>
>> wrote:
>>
>> This is a great proposal, glad to see it happening.
>>
>> I am inclined towards CPU throttling, or more specifically a processing
>> time ratio, rather than request rate throttling as well. Becket has
>> summed up my rationale very well above, and one thing to add here is
>> that the former supports both "protecting against rogue clients" and
>> "utilizing a cluster for multi-tenancy usage": when thinking about how
>> to explain this to end users, I find it actually more natural than the
>> request rate since, as mentioned above, different requests have quite
>> different "cost", and Kafka today already has various request types
>> (produce, fetch, admin, metadata, etc). Because of that, request rate
>> throttling may not be as effective unless it is set very conservatively.
>>
>> Regarding user reactions when they are throttled, I think it may differ
>> case-by-case, and needs to be discovered / guided by looking at the
>> relevant metrics. So in other words users would not expect to get
>> additional information by simply being told "hey, you are throttled",
>> which is all that throttling does; they need to take a follow-up step
>> and see "hmm, I'm throttled probably because of ..", which is done by
>> looking at other metric values: e.g. whether I'm bombarding the brokers
>> with metadata requests, which are usually cheap to handle but I'm
>> sending thousands per second; or is it because I'm catching up and hence
>> sending very heavy fetch requests with large min.bytes, etc.
>>
>> Regarding the implementation, as once discussed with Jun, this seems not
>> very difficult since today we are already collecting the "thread pool
>> utilization" metrics, which is a single percentage "aggregateIdleMeter"
>> value; and we are already effectively aggregating it per request in
>> KafkaRequestHandler, so we can just extend it by recording the source
>> client id when handling requests and aggregating by clientId as well as
>> keeping the total aggregate.
>>
>> Guozhang
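
A rough Java sketch of the per-clientId aggregation Guozhang outlines. In
the real broker this measurement lives in the (Scala) KafkaRequestHandler
loop; the class below is a stand-in showing only the bookkeeping:

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    public class HandlerTimeAggregator {
        private final LongAdder totalBusyNanos = new LongAdder();
        private final ConcurrentHashMap<String, LongAdder> perClientBusyNanos =
                new ConcurrentHashMap<>();

        // Called from the handler loop once a request has been processed,
        // tagging the existing busy-time measurement with the client id.
        public void record(String clientId, long busyNanos) {
            totalBusyNanos.add(busyNanos);
            perClientBusyNanos.computeIfAbsent(clientId, k -> new LongAdder())
                              .add(busyNanos);
        }

        // Fraction of all handler busy time attributable to one client.
        public double clientShare(String clientId) {
            long total = totalBusyNanos.sum();
            LongAdder client = perClientBusyNanos.get(clientId);
            return (total == 0 || client == null)
                    ? 0.0 : (double) client.sum() / total;
        }
    }
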
>> On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <jay@confluent.io> wrote:
>>
>> Hey Becket/Rajini,
>>
>> When I thought about it more deeply I came around to the "percent of
>> processing time" metric too. It seems a lot closer to the thing we
>> actually care about and need to protect. I also think this would be a
>> very useful metric even in the absence of throttling just to debug who's
>> using capacity.
>>
>> Two problems to consider:
>>
>>    1. I agree that for the user it is understandable what led to their
>>    being throttled, but it is a bit hard to figure out the safe range
>>    for them. i.e. if I have a new app that will send 200 messages/sec I
>>    can probably reason that I'll be under the throttling limit of 300
>>    req/sec. However if I need to be under a 10% CPU resources limit it
>>    may be a bit harder for me to know a priori if I will or won't.
>>    2. Calculating the available CPU time is a bit difficult since there
>>    are actually two thread pools--the I/O threads and the network
>>    threads. I think it might be workable to count just the I/O thread
>>    time as in the proposal, but the network thread work is actually
>>    non-trivial (e.g. all the disk reads for fetches happen in that
>>    thread). If you count both the network and I/O threads it can skew
>>    things a bit. E.g. say you have 50 network threads, 10 I/O threads,
>>    and 8 cores, what is the available cpu time in a second? I suppose
>>    this is a problem whenever you have a bottleneck between I/O and
>>    network threads or if you end up significantly over-provisioning one
>>    pool (both of which are hard to avoid).
>>
>> An alternative for CPU throttling would be to use this api:
>> http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
>>
>> That would let you track actual CPU usage across the network, I/O, and
>> purgatory threads and look at it as a percentage of total cores. I think
>> this fixes many problems in the reliability of the metric. Its meaning
>> is slightly different as it is just CPU (you don't get charged for time
>> blocking on I/O) but that may be okay because we already have a throttle
>> on I/O. The downside is I think it is possible this api can be disabled
>> or isn't always available and it may also be expensive (also I've never
>> used it so am not sure if it really works the way I think).
>>
>> -Jay
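
For reference, a self-contained Java probe of the API Jay mentions. The
calls below are the standard java.lang.management ones; note that support
is optional, which is exactly the caveat raised above:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class CpuTimeProbe {
        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            if (!bean.isThreadCpuTimeSupported()) {
                System.out.println("Thread CPU time not supported here");
                return;
            }
            if (!bean.isThreadCpuTimeEnabled())
                bean.setThreadCpuTimeEnabled(true);

            long before = bean.getCurrentThreadCpuTime(); // CPU nanos, not wall clock
            // ... do some work ...
            long cpuNanos = bean.getCurrentThreadCpuTime() - before;
            System.out.println("CPU nanos consumed: " + cpuNanos);

            // For another thread (e.g. a network or purgatory thread), by id;
            // returns -1 if the thread died or measurement is disabled.
            long otherCpu = bean.getThreadCpuTime(Thread.currentThread().getId());
            System.out.println("By-id measurement: " + otherCpu);
        }
    }
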
>> On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com>
>> wrote:
>>
>> If the purpose of the KIP is only to protect the cluster from being
>> overwhelmed by crazy clients and is not intended to address the resource
>> allocation problem among the clients, I am wondering if using a request
>> handling time quota (CPU time quota) is a better option. Here are the
>> reasons:
>>
>> 1. A request handling time quota has better protection. Say we have a
>> request rate quota and set that to some value like 100 requests/sec; it
>> is possible that some of the requests are very expensive and actually
>> take a lot of time to handle. In that case a few clients may still
>> occupy a lot of CPU time even though the request rate is low. Arguably
>> we can carefully set a request rate quota for each request and client id
>> combination, but it could still be tricky to get it right for everyone.
>>
>> If we use the request handling time quota, we can simply say no client
>> can take up more than 30% of the total request handling capacity
>> (measured by time), regardless of the difference among different
>> requests or what the client is doing. In this case maybe we can quota
>> all the requests if we want to.
>>
>> 2. The main benefit of using a request rate limit is that it seems more
>> intuitive. It is true that it is probably easier to explain to the user
>> what it means. However, in practice it looks like the impact of a
>> request rate quota is no more quantifiable than that of a request
>> handling time quota. Unlike the byte rate quota, it is still difficult
>> to give a number for the impact on throughput or latency when a request
>> rate quota is hit. So it is not better than the request handling time
>> quota. In fact I feel it is clearer to tell a user "you are limited
>> because you have taken 30% of the CPU time on the broker" than something
>> like "your request rate quota on metadata requests has been reached".
>>
>> Thanks,
>>
>> Jiangjie (Becket) Qin
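
A made-up worked example of Becket's first point. Two clients can be
identical under a 100 requests/sec rate quota yet wildly different in
cost:

    cheap client:     100 req/sec x ~1 ms handler time   = 0.1 handler-seconds/sec
    expensive client: 100 req/sec x ~150 ms handler time = 15 handler-seconds/sec
                      (the equivalent of 15 fully busy request handler threads)

A handler-time quota of, say, 30% caps both at the same share of the
thread pool regardless of per-request cost.
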
>> On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io> wrote:
>>
>> I think this proposal makes a lot of sense (especially now that it is
>> oriented around request rate) and fills the biggest remaining gap in the
>> multi-tenancy story.
>>
>> I think for intra-cluster communication (StopReplica, etc) we could
>> avoid throttling entirely. You can secure or otherwise lock down the
>> cluster communication to prevent any unauthorized external party from
>> trying to initiate these requests. As a result we are as likely to cause
>> problems as solve them by throttling these, right?
>>
>> I'm not so sure that we should exempt the consumer requests such as
>> heartbeat. It's true that if we throttle an app's heartbeat requests it
>> may cause it to fall out of its consumer group. However if we don't
>> throttle them it may DDOS the cluster if the heartbeat interval is set
>> incorrectly or if some client in some language has a bug. I think the
>> policy with this kind of throttling is to protect the cluster above any
>> individual app, right? I think in general this should be okay since for
>> most deployments this setting is meant as more of a safety valve---that
>> is, rather than set something very close to what you expect to need (say
>> 2 req/sec or whatever) you would have something quite high (like 100
>> req/sec) meant to stop a client gone crazy. I think when used this way
>> allowing those to be throttled would actually provide meaningful
>> protection.
>>
>> -Jay
>>
>> On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram
>> <rajinisivaram@gmail.com> wrote:
>> [original KIP-124 announcement snipped]
>>
>> --
>> -- Guozhang

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Dong Lin <li...@gmail.com>.
Hey Jun,

Yeah you are right. I thought it wasn't because at LinkedIn it would be too
much pressure on inGraph to expose those per-clientId metrics, so we ended
up printing them periodically to a local log. Never mind if it is not a
general problem.

Hey Rajini,

- I agree with Jay that we probably don't want to add a new field to
ProduceResponse or FetchResponse for every quota. Is there any use-case for
having separate throttle-time fields for byte-rate-quota and
io-thread-unit-quota? You probably need to document this as an interface
change if you plan to add a new field to any request.

- I don't think IOThread belongs to quotaType. The existing quota types
(i.e. Produce/Fetch/LeaderReplication/FollowerReplication) identify the
type of request that are throttled, not the quota mechanism that is applied.

- If a request is throttled due to this io-thread-unit-based quota, is the
existing queue-size metric in ClientQuotaManager incremented?

- In the interest of providing a guideline for admins to decide the
io-thread-unit-based quota and for users to understand its impact on their
traffic, would it be useful to have a metric that shows the overall
byte-rate per io-thread-unit? Can we also show this as a per-clientId metric?

Thanks,
Dong


On Thu, Feb 23, 2017 at 9:25 AM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Ismael,
>
> For #3, typically, an admin won't configure more io threads than CPU cores,
> but it's possible for an admin to start with fewer io threads than cores
> and grow that later on.
>
> Hi, Dong,
>
> I think the throttleTime sensor on the broker tells the admin whether a
> user/clientId is throttled or not.
>
> Hi, Radi,
>
> The reasoning for delaying the throttled requests on the broker instead of
> returning an error immediately is that the latter has no way to prevent the
> client from retrying immediately, which will make things worse. The
> delaying logic is based on a delay queue. A separate expiration thread
> just waits on the next to be expired request. So, it doesn't tie up a
> request handler thread.
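
A minimal Java sketch of the delay mechanism Jun describes, using the
standard java.util.concurrent.DelayQueue; the response callback and class
names are invented for illustration:

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    public class ThrottledResponseQueue {
        static class ThrottledResponse implements Delayed {
            final long sendAtNanos;
            final Runnable sendResponse; // hypothetical callback that writes the response

            ThrottledResponse(long delayMs, Runnable sendResponse) {
                this.sendAtNanos = System.nanoTime()
                        + TimeUnit.MILLISECONDS.toNanos(delayMs);
                this.sendResponse = sendResponse;
            }

            @Override
            public long getDelay(TimeUnit unit) {
                return unit.convert(sendAtNanos - System.nanoTime(),
                        TimeUnit.NANOSECONDS);
            }

            @Override
            public int compareTo(Delayed other) {
                return Long.compare(getDelay(TimeUnit.NANOSECONDS),
                        other.getDelay(TimeUnit.NANOSECONDS));
            }
        }

        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        public void throttle(long delayMs, Runnable sendResponse) {
            queue.put(new ThrottledResponse(delayMs, sendResponse));
        }

        // The dedicated expiration thread blocks on take(), which only
        // returns entries whose delay has elapsed, so no request handler
        // thread is ever tied up waiting.
        public void runExpirationLoop() throws InterruptedException {
            while (true)
                queue.take().sendResponse.run();
        }
    }
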
>
> Thanks,
>
> Jun
>
> On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk> wrote:
>
> > Hi Jay,
> >
> > Regarding 1, I definitely like the simplicity of keeping a single
> throttle
> > time field in the response. The downside is that the client metrics will
> be
> > more coarse grained.
> >
> > Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> > `log.cleaner.min.cleanable.ratio`.
> >
> > Ismael
> >
> > On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io> wrote:
> >
> > > A few minor comments:
> > >
> > >    1. Isn't it the case that the throttling time response field should
> > >    have the total time your request was throttled irrespective of the
> > >    quotas that caused it? Limiting it to the byte rate quota doesn't
> > >    make sense, but I also don't think we want to end up adding new
> > >    fields in the response for every single thing we quota, right?
> > >    2. I don't think we should make this quota specifically about io
> > >    threads. Once we introduce these quotas people set them and expect
> > >    them to be enforced (and if they aren't it may cause an outage). As
> > >    a result they are a bit more sensitive than normal configs, I think.
> > >    The current thread pools seem like something of an implementation
> > >    detail and not the level the user-facing quotas should be involved
> > >    with. I think it might be better to make this a general request-time
> > >    throttle with no mention in the naming about I/O threads and simply
> > >    acknowledge the current limitation (which we may someday fix) in the
> > >    docs that this covers only the time after the request is read off
> > >    the network.
> > >    3. As such I think the right interface to the user would be
> > >    something like percent_request_time, in {0,...,100}, or
> > >    request_time_ratio, in {0.0,...,1.0} (I think "ratio" is the
> > >    terminology we used if the scale is between 0 and 1 in the other
> > >    metrics, right?)
> > >
> > > -Jay
> > >
> > > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <
> rajinisivaram@gmail.com
> > >
> > > wrote:
> > >
> > > > Guozhang/Dong,
> > > >
> > > > Thank you for the feedback.
> > > >
> > > > Guozhang : I have updated the section on co-existence of byte rate
> and
> > > > request time quotas.
> > > >
> > > > Dong: I hadn't added much detail to the metrics and sensors since
> they
> > > are
> > > > going to be very similar to the existing metrics and sensors. To
> avoid
> > > > confusion, I have now added more detail. All metrics are in the group
> > > > "quotaType" and all sensors have names starting with "quotaType"
> (where
> > > > quotaType is Produce/Fetch/LeaderReplication/
> > > > FollowerReplication/*IOThread*).
> > > > So there will be no reuse of existing metrics/sensors. The new ones
> for
> > > > request processing time based throttling will be completely
> independent
> > > of
> > > > existing metrics/sensors, but will be consistent in format.
> > > >
> > > > The existing throttle_time_ms field in produce/fetch responses will
> not
> > > be
> > > > impacted by this KIP. That will continue to return byte-rate based
> > > > throttling times. In addition, a new field request_throttle_time_ms
> > will
> > > be
> > > > added to return request quota based throttling times. These will be
> > > exposed
> > > > as new metrics on the client-side.
> > > >
> > > > Since all metrics and sensors are different for each type of quota, I
> > > > believe there is already sufficient metrics to monitor throttling on
> > both
> > > > client and broker side for each type of throttling.
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > >
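
For illustration, a rough sketch against Kafka's
org.apache.kafka.common.metrics API of how a quota-bounded sensor behaves.
The sensor and metric names here merely follow the "quotaType" convention
Rajini describes and are assumptions, not the KIP's exact names:

    import org.apache.kafka.common.metrics.MetricConfig;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Quota;
    import org.apache.kafka.common.metrics.QuotaViolationException;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Rate;

    public class IoThreadQuotaExample {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            Sensor sensor = metrics.sensor("IOThread-user1",
                    new MetricConfig().quota(Quota.upperBound(0.05)));
            sensor.add(metrics.metricName("io-thread-time-rate", "IOThread",
                    "request handler time used by user1"), new Rate());
            try {
                sensor.record(0.002); // seconds of handler time for one request
            } catch (QuotaViolationException e) {
                // a broker would compute a delay here and park the response
            }
        }
    }
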
> > > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <li...@gmail.com>
> wrote:
> > > >
> > > > > Hey Rajini,
> > > > >
> > > > > I think it makes a lot of sense to use io_thread_units as metric to
> > > quota
> > > > > user's traffic here. LGTM overall. I have some questions regarding
> > > > sensors.
> > > > >
> > > > > - Can you be more specific in the KIP what sensors will be added?
> For
> > > > > example, it will be useful to specify the name and attributes of
> > these
> > > > new
> > > > > sensors.
> > > > >
> > > > > - We currently have throttle-time and queue-size for byte-rate
> based
> > > > quota.
> > > > > Are you going to have separate throttle-time and queue-size for
> > > requests
> > > > > throttled by io_thread_unit-based quota, or will they share the
> same
> > > > > sensor?
> > > > >
> > > > > - Does the throttle-time in the ProduceResponse and FetchResponse
> > > > > contain time due to the io_thread_unit-based quota?
> > > > >
> > > > > - Currently the kafka server doesn't provide any log or metrics that
> > > > > tell whether any given clientId (or user) is throttled. This is not
> > > > > too bad because we can still check the client-side byte-rate metric
> > > > > to validate whether a given client is throttled. But with this
> > > > > io_thread_unit, there will be no way to validate whether a given
> > > > > client is slow because it has exceeded its io_thread_unit limit. It
> > > > > is necessary for users to be able to know this information to figure
> > > > > out whether they have reached their quota limit. How about we add a
> > > > > log4j log on the server side to periodically print the (client_id,
> > > > > byte-rate-throttle-time, io-thread-unit-throttle-time) so that the
> > > > > kafka administrator can figure out which users have reached their
> > > > > limit and act accordingly?
> > > > >
> > > > > Thanks,
> > > > > Dong
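
Dong's periodic-logging suggestion could look something like the sketch
below (the names and the collection mechanism are assumptions; only the
scheduling and the log format matter here):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    public class ThrottleLogReporter {
        private static final Logger log =
                LoggerFactory.getLogger(ThrottleLogReporter.class);

        // clientId -> {byte-rate throttle ms, io-thread throttle ms},
        // filled in by the quota managers (not shown).
        private final ConcurrentHashMap<String, long[]> throttleMs =
                new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        public void start() {
            scheduler.scheduleAtFixedRate(() -> {
                for (Map.Entry<String, long[]> e : throttleMs.entrySet()) {
                    long[] t = e.getValue();
                    if (t[0] > 0 || t[1] > 0)
                        log.info("client_id={} byte_rate_throttle_ms={} io_thread_throttle_ms={}",
                                e.getKey(), t[0], t[1]);
                }
            }, 1, 1, TimeUnit.MINUTES);
        }
    }
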
> > > > >
> > > > >
> > > > >
> > > > >
> > > > >
> > > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wangguoz@gmail.com
> >
> > > > wrote:
> > > > >
> > > > > > Made a pass over the doc, overall LGTM except a minor comment on
> > the
> > > > > > throttling implementation:
> > > > > >
> > > > > > Stated as "Request processing time throttling will be applied on
> > > > > > top if necessary." I thought that it meant the request processing
> > > > > > time throttling is applied first, but continuing to read I found it
> > > > > > actually meant to apply produce / fetch byte rate throttling first.
> > > > > >
> > > > > > Also the last sentence "The remaining delay if any is applied to
> > the
> > > > > > response." is a bit confusing to me. Maybe rewording it a bit?
> > > > > >
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > >
> > > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io>
> wrote:
> > > > > >
> > > > > > > Hi, Rajini,
> > > > > > >
> > > > > > > Thanks for the updated KIP. The latest proposal looks good to
> me.
> > > > > > >
> > > > > > > Jun
> > > > > > >
> > > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> > > > > rajinisivaram@gmail.com
> > > > > > >
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Jun/Roger,
> > > > > > > >
> > > > > > > > Thank you for the feedback.
> > > > > > > >
> > > > > > > > 1. I have updated the KIP to use absolute units instead of
> > > > > > > > percentage. The property is called *io_thread_units* to align
> > > > > > > > with the thread count property *num.io.threads*. When we
> > > > > > > > implement network thread utilization quotas, we can add another
> > > > > > > > property *network_thread_units*.
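
If io_thread_units is configured the same way as the existing byte-rate
quotas, setting it for a user might look something like the command below
(hypothetical; the KIP and the tooling define the actual mechanics):

    bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
        --add-config 'io_thread_units=0.1' \
        --entity-type users --entity-name user1
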
> > > > > > > >
> > > > > > > > 2. ControlledShutdown is already listed under the exempt
> > > requests.
> > > > > Jun,
> > > > > > > did
> > > > > > > > you mean a different request that needs to be added? The four
> > > > > requests
> > > > > > > > currently exempt in the KIP are StopReplica,
> > ControlledShutdown,
> > > > > > > > LeaderAndIsr and UpdateMetadata. These are controlled using
> > > > > > ClusterAction
> > > > > > > > ACL, so it is easy to exclude and only throttle if
> > unauthorized.
> > > I
> > > > > > wasn't
> > > > > > > > sure if there are other requests used only for inter-broker
> > that
> > > > > needed
> > > > > > > to
> > > > > > > > be excluded.
> > > > > > > >
> > > > > > > > 3. I was thinking the smallest change would be to replace all
> > > > > > > > references to *requestChannel.sendResponse()* with a local
> > > > > > > > method *sendResponseMaybeThrottle()* that does the throttling,
> > > > > > > > if any, and then sends the response. If we throttle first in
> > > > > > > > *KafkaApis.handle()*, the time spent within the method handling
> > > > > > > > the request will not be recorded or used in throttling. We can
> > > > > > > > look into this again when the PR is ready for review.
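
A Java-flavored sketch of Rajini's point 3 (the broker itself is Scala;
every type here is a stand-in, not the real KafkaApis signature). The
point is that each call site funnels through one helper, so the time
spent inside the handler method is always recorded before the response
leaves:

    class ApisSketch {
        interface Channel { void sendResponse(Object response); }
        interface Throttler {
            // records handler time and returns the mandated delay
            // (0 if the client is within its quota)
            long recordAndGetDelayMs(String clientId, long handlerTimeNanos);
            void sendDelayed(long delayMs, Runnable send); // e.g. via a DelayQueue
        }

        private final Channel channel;
        private final Throttler throttler;

        ApisSketch(Channel channel, Throttler throttler) {
            this.channel = channel;
            this.throttler = throttler;
        }

        // Replaces direct requestChannel.sendResponse(...) calls.
        void sendResponseMaybeThrottle(String clientId, long handlerTimeNanos,
                                       Object response) {
            long delayMs = throttler.recordAndGetDelayMs(clientId, handlerTimeNanos);
            if (delayMs > 0)
                throttler.sendDelayed(delayMs, () -> channel.sendResponse(response));
            else
                channel.sendResponse(response);
        }
    }
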
> > > > > > > >
> > > > > > > > Regards,
> > > > > > > >
> > > > > > > > Rajini
> > > > > > > >
> > > > > > > >
> > > > > > > >
> > > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > > > > roger.hoover@gmail.com>
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Great to see this KIP and the excellent discussion.
> > > > > > > > >
> > > > > > > > > To me, Jun's suggestion makes sense.  If my application is
> > > > > allocated
> > > > > > 1
> > > > > > > > > request handler unit, then it's as if I have a Kafka broker
> > > with
> > > > a
> > > > > > > single
> > > > > > > > > request handler thread dedicated to me.  That's the most I
> > can
> > > > use,
> > > > > > at
> > > > > > > > > least.  That allocation doesn't change even if an admin
> later
> > > > > > increases
> > > > > > > > the
> > > > > > > > > size of the request thread pool on the broker.  It's
> similar
> > to
> > > > the
> > > > > > CPU
> > > > > > > > > abstraction that VMs and containers get from hypervisors or
> > OS
> > > > > > > > schedulers.
> > > > > > > > > While different client access patterns can use wildly
> > different
> > > > > > amounts
> > > > > > > > of
> > > > > > > > > request thread resources per request, a given application
> > will
> > > > > > > generally
> > > > > > > > > have a stable access pattern and can figure out empirically
> > how
> > > > > many
> > > > > > > > > "request thread units" it needs to meet it's
> > throughput/latency
> > > > > > goals.
> > > > > > > > >
> > > > > > > > > Cheers,
> > > > > > > > >
> > > > > > > > > Roger
> > > > > > > > >
> > > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <jun@confluent.io
> >
> > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Rajini,
> > > > > > > > > >
> > > > > > > > > > Thanks for the updated KIP. A few more comments.
> > > > > > > > > >
> > > > > > > > > > 1. A concern of request_time_percent is that it's not an
> > > > absolute
> > > > > > > > value.
> > > > > > > > > > Let's say you give a user a 10% limit. If the admin
> doubles
> > > the
> > > > > > > number
> > > > > > > > of
> > > > > > > > > > request handler threads, that user now actually has twice
> > the
> > > > > > > absolute
> > > > > > > > > > capacity. This may confuse people a bit. So, perhaps
> > setting
> > > > the
> > > > > > > quota
> > > > > > > > > > based on an absolute request thread unit is better.
> > > > > > > > > >
> > > > > > > > > > 2. ControlledShutdownRequest is also an inter-broker
> > request
> > > > and
> > > > > > > needs
> > > > > > > > to
> > > > > > > > > > be excluded from throttling.
> > > > > > > > > >
> > > > > > > > > > 3. Implementation wise, I am wondering if it's simpler to
> > > apply
> > > > > the
> > > > > > > > > request
> > > > > > > > > > time throttling first in KafkaApis.handle(). Otherwise,
> we
> > > will
> > > > > > need
> > > > > > > to
> > > > > > > > > add
> > > > > > > > > > the throttling logic in each type of request.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Jun,
> > > > > > > > > > >
> > > > > > > > > > > Thank you for the review.
> > > > > > > > > > >
> > > > > > > > > > > I have reverted to the original KIP that throttles
> based
> > on
> > > > > > request
> > > > > > > > > > handler
> > > > > > > > > > > utilization. At the moment, it uses percentage, but I
> am
> > > > happy
> > > > > to
> > > > > > > > > change
> > > > > > > > > > to
> > > > > > > > > > > a fraction (out of 1 instead of 100) if required. I
> have
> > > > added
> > > > > > the
> > > > > > > > > > examples
> > > > > > > > > > > from this discussion to the KIP. Also added a "Future
> > Work"
> > > > > > section
> > > > > > > > to
> > > > > > > > > > > address network thread utilization. The configuration
> is
> > > > named
> > > > > > > > > > > "request_time_percent" with the expectation that it can
> > > also
> > > > be
> > > > > > > used
> > > > > > > > as
> > > > > > > > > > the
> > > > > > > > > > > limit for network thread utilization when that is
> > > > implemented,
> > > > > so
> > > > > > > > that
> > > > > > > > > > > users have to set only one config for the two and not
> > have
> > > to
> > > > > > worry
> > > > > > > > > about
> > > > > > > > > > > the internal distribution of the work between the two
> > > thread
> > > > > > pools
> > > > > > > in
> > > > > > > > > > > Kafka.
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > >
> > > > > > > > > > > Rajini
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> > > jun@confluent.io>
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks for the proposal.
> > > > > > > > > > > >
> > > > > > > > > > > > The benefit of using the request processing time over
> > the
> > > > > > request
> > > > > > > > > rate
> > > > > > > > > > is
> > > > > > > > > > > > exactly what people have said. I will just expand
> that
> > a
> > > > bit.
> > > > > > > > > Consider
> > > > > > > > > > > the
> > > > > > > > > > > > following case. The producer sends a produce request
> > > with a
> > > > > > 10MB
> > > > > > > > > > message
> > > > > > > > > > > > but compressed to 100KB with gzip. The decompression
> of
> > > the
> > > > > > > message
> > > > > > > > > on
> > > > > > > > > > > the
> > > > > > > > > > > > broker could take 10-15 seconds, during which time, a
> > > > request
> > > > > > > > handler
> > > > > > > > > > > > thread is completely blocked. In this case, neither
> the
> > > > > byte-in
> > > > > > > > quota
> > > > > > > > > > nor
> > > > > > > > > > > > the request rate quota may be effective in protecting
> > the
> > > > > > broker.
> > > > > > > > > > > Consider
> > > > > > > > > > > > another case. A consumer group starts with 10
> instances
> > > and
> > > > > > later
> > > > > > > > on
> > > > > > > > > > > > switches to 20 instances. The request rate will
> likely
> > > > > double,
> > > > > > > but
> > > > > > > > > the
> > > > > > > > > > > > actually load on the broker may not double since each
> > > fetch
> > > > > > > request
> > > > > > > > > > only
> > > > > > > > > > > > contains half of the partitions. Request rate quota
> may
> > > not
> > > > > be
> > > > > > > easy
> > > > > > > > > to
> > > > > > > > > > > > configure in this case.
> > > > > > > > > > > >
> > > > > > > > > > > > What we really want is to be able to prevent a client
> > > from
> > > > > > using
> > > > > > > > too
> > > > > > > > > > much
> > > > > > > > > > > > of the server side resources. In this particular KIP,
> > > this
> > > > > > > resource
> > > > > > > > > is
> > > > > > > > > > > the
> > > > > > > > > > > > capacity of the request handler threads. I agree that
> > it
> > > > may
> > > > > > not
> > > > > > > be
> > > > > > > > > > > > intuitive for the users to determine how to set the
> > right
> > > > > > limit.
> > > > > > > > > > However,
> > > > > > > > > > > > this is not completely new and has been done in the
> > > > container
> > > > > > > world
> > > > > > > > > > > > already. For example, Linux cgroup (
> > > > > https://access.redhat.com/
> > > > > > > > > > > > documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
> > > > > > > > > > > > Resource_Management_Guide/sec-cpu.html) has the
> > concept
> > > of
> > > > > > > > > > > > cpu.cfs_quota_us,
> > > > > > > > > > > > which specifies the total amount of time in
> > microseconds
> > > > for
> > > > > > > which
> > > > > > > > > all
> > > > > > > > > > > > tasks in a cgroup can run during a one second period.
> > We
> > > > can
> > > > > > > > > > potentially
> > > > > > > > > > > > model the request handler threads in a similar way.
> For
> > > > > > example,
> > > > > > > > each
> > > > > > > > > > > > request handler thread can be 1 request handler unit
> > and
> > > > the
> > > > > > > admin
> > > > > > > > > can
> > > > > > > > > > > > configure a limit on how many units (say 0.01) a
> client
> > > can
> > > > > > have.
> > > > > > > > > > > >
> > > > > > > > > > > > Regarding not throttling the internal broker to
> broker
> > > > > > requests.
> > > > > > > We
> > > > > > > > > > could
> > > > > > > > > > > > do that. Alternatively, we could just let the admin
> > > > > configure a
> > > > > > > > high
> > > > > > > > > > > limit
> > > > > > > > > > > > for the kafka user (it may not be able to do that
> > easily
> > > > > based
> > > > > > on
> > > > > > > > > > > clientId
> > > > > > > > > > > > though).
> > > > > > > > > > > >
> > > > > > > > > > > > Ideally we want to be able to protect the utilization
> > of
> > > > the
> > > > > > > > network
> > > > > > > > > > > thread
> > > > > > > > > > > > pool too. The difficult is mostly what Rajini said:
> (1)
> > > The
> > > > > > > > mechanism
> > > > > > > > > > for
> > > > > > > > > > > > throttling the requests is through Purgatory and we
> > will
> > > > have
> > > > > > to
> > > > > > > > > think
> > > > > > > > > > > > through how to integrate that into the network layer.
> > > (2)
> > > > In
> > > > > > the
> > > > > > > > > > network
> > > > > > > > > > > > layer, currently we know the user, but not the
> clientId
> > > of
> > > > > the
> > > > > > > > > request.
> > > > > > > > > > > So,
> > > > > > > > > > > > it's a bit tricky to throttle based on clientId
> there.
> > > > Plus,
> > > > > > the
> > > > > > > > > > byteOut
> > > > > > > > > > > > quota can already protect the network thread
> > utilization
> > > > for
> > > > > > > fetch
> > > > > > > > > > > > requests. So, if we can't figure out this part right
> > now,
> > > > > just
> > > > > > > > > focusing
> > > > > > > > > > > on
> > > > > > > > > > > > the request handling threads for this KIP is still a
> > > useful
> > > > > > > > feature.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jun
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <
> > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Thank you all for the feedback.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jay: I have removed exemption for consumer
> heartbeat
> > > etc.
> > > > > > Agree
> > > > > > > > > that
> > > > > > > > > > > > > protecting the cluster is more important than
> > > protecting
> > > > > > > > individual
> > > > > > > > > > > apps.
> > > > > > > > > > > > > Have retained the exemption for
> > > StopReplica/LeaderAndIsr
> > > > > > etc,
> > > > > > > > > these
> > > > > > > > > > > are
> > > > > > > > > > > > > throttled only if authorization fails (so can't be
> > used
> > > > for
> > > > > > DoS
> > > > > > > > > > attacks
> > > > > > > > > > > > in
> > > > > > > > > > > > > a secure cluster, but allows inter-broker requests
> to
> > > > > > complete
> > > > > > > > > > without
> > > > > > > > > > > > > delays).
> > > > > > > > > > > > >
> > > > > > > > > > > > > I will wait another day to see if there is any
> > > objection
> > > > to
> > > > > > > > quotas
> > > > > > > > > > > based
> > > > > > > > > > > > on
> > > > > > > > > > > > > request processing time (as opposed to request
> rate)
> > > and
> > > > if
> > > > > > > there
> > > > > > > > > are
> > > > > > > > > > > no
> > > > > > > > > > > > > objections, I will revert to the original proposal
> > with
> > > > > some
> > > > > > > > > changes.
> > > > > > > > > > > > >
> > > > > > > > > > > > > The original proposal was only including the time
> > used
> > > by
> > > > > the
> > > > > > > > > request
> > > > > > > > > > > > > handler threads (that made calculation easy). I
> think
> > > the
> > > > > > > > > suggestion
> > > > > > > > > > is
> > > > > > > > > > > > to
> > > > > > > > > > > > > include the time spent in the network threads as
> well
> > > > since
> > > > > > > that
> > > > > > > > > may
> > > > > > > > > > be
> > > > > > > > > > > > > significant. As Jay pointed out, it is more
> > complicated
> > > > to
> > > > > > > > > calculate
> > > > > > > > > > > the
> > > > > > > > > > > > > total available CPU time and convert to a ratio
> when
> > > > there
> > > > > > *m*
> > > > > > > > I/O
> > > > > > > > > > > > threads
> > > > > > > > > > > > > and *n* network threads.
> > ThreadMXBean#getThreadCPUTime(
> > > )
> > > > > may
> > > > > > > > give
> > > > > > > > > us
> > > > > > > > > > > > what
> > > > > > > > > > > > > we want, but it can be very expensive on some
> > > platforms.
> > > > As
> > > > > > > > Becket
> > > > > > > > > > and
> > > > > > > > > > > > > Guozhang have pointed out, we do have several time
> > > > > > measurements
> > > > > > > > > > already
> > > > > > > > > > > > for
> > > > > > > > > > > > > generating metrics that we could use, though we
> might
> > > > want
> > > > > to
> > > > > > > > > switch
> > > > > > > > > > to
> > > > > > > > > > > > > nanoTime() instead of currentTimeMillis() since
> some
> > of
> > > > the
> > > > > > > > values
> > > > > > > > > > for
> > > > > > > > > > > > > small requests may be < 1ms. But rather than add up
> > the
> > > > > time
> > > > > > > > spent
> > > > > > > > > in
> > > > > > > > > > > I/O
> > > > > > > > > > > > > thread and network thread, wouldn't it be better to
> > > > convert
> > > > > > the
> > > > > > > > > time
> > > > > > > > > > > > spent
> > > > > > > > > > > > > on each thread into a separate ratio? UserA has a
> > > request
> > > > > > quota
> > > > > > > > of
> > > > > > > > > > 5%.
> > > > > > > > > > > > Can
> > > > > > > > > > > > > we take that to mean that UserA can use 5% of the
> > time
> > > on
> > > > > > > network
> > > > > > > > > > > threads
> > > > > > > > > > > > > and 5% of the time on I/O threads? If either is
> > > exceeded,
> > > > > the
> > > > > > > > > > response
> > > > > > > > > > > is
> > > > > > > > > > > > > throttled - it would mean maintaining two sets of
> > > metrics
> > > > > for
> > > > > > > the
> > > > > > > > > two
> > > > > > > > > > > > > durations, but would result in more meaningful
> > ratios.
> > > We
> > > > > > could
> > > > > > > > > > define
> > > > > > > > > > > > two
> > > > > > > > > > > > > quota limits (UserA has 5% of request threads and
> 10%
> > > of
> > > > > > > network
> > > > > > > > > > > > threads),
> > > > > > > > > > > > > but that seems unnecessary and harder to explain to
> > > > users.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Back to why and how quotas are applied to network
> > > thread
> > > > > > > > > utilization:
> > > > > > > > > > > > > a) In the case of fetch,  the time spent in the
> > network
> > > > > > thread
> > > > > > > > may
> > > > > > > > > be
> > > > > > > > > > > > > significant and I can see the need to include this.
> > Are
> > > > > there
> > > > > > > > other
> > > > > > > > > > > > > requests where the network thread utilization is
> > > > > significant?
> > > > > > > In
> > > > > > > > > the
> > > > > > > > > > > case
> > > > > > > > > > > > > of fetch, request handler thread utilization would
> > > > throttle
> > > > > > > > clients
> > > > > > > > > > > with
> > > > > > > > > > > > > high request rate, low data volume and fetch byte
> > rate
> > > > > quota
> > > > > > > will
> > > > > > > > > > > > throttle
> > > > > > > > > > > > > clients with high data volume. Network thread
> > > utilization
> > > > > is
> > > > > > > > > perhaps
> > > > > > > > > > > > > proportional to the data volume. I am wondering if
> we
> > > > even
> > > > > > need
> > > > > > > > to
> > > > > > > > > > > > throttle
> > > > > > > > > > > > > based on network thread utilization or whether the
> > data
> > > > > > volume
> > > > > > > > > quota
> > > > > > > > > > > > covers
> > > > > > > > > > > > > this case.
> > > > > > > > > > > > >
> > > > > > > > > > > > > b) At the moment, we record and check for quota
> > > violation
> > > > > at
> > > > > > > the
> > > > > > > > > same
> > > > > > > > > > > > time.
> > > > > > > > > > > > > If a quota is violated, the response is delayed.
> > Using
> > > > > Jay'e
> > > > > > > > > example
> > > > > > > > > > of
> > > > > > > > > > > > > disk reads for fetches happening in the network
> > thread,
> > > > We
> > > > > > > can't
> > > > > > > > > > record
> > > > > > > > > > > > and
> > > > > > > > > > > > > delay a response after the disk reads. We could
> > record
> > > > the
> > > > > > time
> > > > > > > > > spent
> > > > > > > > > > > on
> > > > > > > > > > > > > the network thread when the response is complete
> and
> > > > > > introduce
> > > > > > > a
> > > > > > > > > > delay
> > > > > > > > > > > > for
> > > > > > > > > > > > > handling a subsequent request (separate out
> recording
> > > and
> > > > > > quota
> > > > > > > > > > > violation
> > > > > > > > > > > > > handling in the case of network thread overload).
> > Does
> > > > that
> > > > > > > make
> > > > > > > > > > sense?
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Rajini
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
> > > > > > > > becket.qin@gmail.com>
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hey Jay,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time is a
> > little
> > > > > > > tricky. I
> > > > > > > > > am
> > > > > > > > > > > > > thinking
> > > > > > > > > > > > > > that maybe we can use the existing request
> > > statistics.
> > > > > They
> > > > > > > are
> > > > > > > > > > > already
> > > > > > > > > > > > > > very detailed so we can probably see the
> > approximate
> > > > CPU
> > > > > > time
> > > > > > > > > from
> > > > > > > > > > > it,
> > > > > > > > > > > > > e.g.
> > > > > > > > > > > > > > something like (total_time -
> > > > request/response_queue_time
> > > > > -
> > > > > > > > > > > > remote_time).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I agree with Guozhang that when a user is
> throttled
> > > it
> > > > is
> > > > > > > > likely
> > > > > > > > > > that
> > > > > > > > > > > > we
> > > > > > > > > > > > > > need to see if anything has went wrong first, and
> > if
> > > > the
> > > > > > > users
> > > > > > > > > are
> > > > > > > > > > > well
> > > > > > > > > > > > > > behaving and just need more resources, we will
> have
> > > to
> > > > > bump
> > > > > > > up
> > > > > > > > > the
> > > > > > > > > > > > quota
> > > > > > > > > > > > > > for them. It is true that pre-allocating CPU time
> > > quota
> > > > > > > > precisely
> > > > > > > > > > for
> > > > > > > > > > > > the
> > > > > > > > > > > > > > users is difficult. So in practice it would
> > probably
> > > be
> > > > > > more
> > > > > > > > like
> > > > > > > > > > > first
> > > > > > > > > > > > > set
> > > > > > > > > > > > > > a relative high protective CPU time quota for
> > > everyone
> > > > > and
> > > > > > > > > increase
> > > > > > > > > > > > that
> > > > > > > > > > > > > > for some individual clients on demand.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <
> > > > > > > > > wangguoz@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > This is a great proposal, glad to see it
> > happening.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I am inclined to the CPU throttling, or more
> > > > > specifically
> > > > > > > > > > > processing
> > > > > > > > > > > > > time
> > > > > > > > > > > > > > > ratio instead of the request rate throttling as
> > > well.
> > > > > > > Becket
> > > > > > > > > has
> > > > > > > > > > > very
> > > > > > > > > > > > > > well
> > > > > > > > > > > > > > summed up my rationales above, and one thing to
> add
> > > here
> > > > > is
> > > > > > > that
> > > > > > > > > the
> > > > > > > > > > > > > former
> > > > > > > > > > > > > > > has a good support for both "protecting against
> > > rogue
> > > > > > > > clients"
> > > > > > > > > as
> > > > > > > > > > > > well
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > "utilizing a cluster for multi-tenancy usage":
> > when
> > > > > > > thinking
> > > > > > > > > > about
> > > > > > > > > > > > how
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > explain this to the end users, I find it
> actually
> > > > more
> > > > > > > > natural
> > > > > > > > > > than
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > request rate since as mentioned above,
> different
> > > > > requests
> > > > > > > > will
> > > > > > > > > > have
> > > > > > > > > > > > > quite
> > > > > > > > > > > > > > > different "cost", and Kafka today already have
> > > > various
> > > > > > > > request
> > > > > > > > > > > types
> > > > > > > > > > > > > > > (produce, fetch, admin, metadata, etc), because
> > of
> > > > that
> > > > > > the
> > > > > > > > > > request
> > > > > > > > > > > > > rate
> > > > > > > > > > > > > > > throttling may not be as effective unless it is
> > set
> > > > > very
> > > > > > > > > > > > > conservatively.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regarding user reactions when they are
> > > throttled,
> > > > I
> > > > > > > think
> > > > > > > > it
> > > > > > > > > > may
> > > > > > > > > > > > > > differ
> > > > > > > > > > > > > > > case-by-case, and need to be discovered /
> guided
> > by
> > > > > > looking
> > > > > > > > at
> > > > > > > > > > > > relative
> > > > > > > > > > > > > > > metrics. So in other words users would not
> expect
> > > to
> > > > > get
> > > > > > > > > > additional
> > > > > > > > > > > > > > > information by simply being told "hey, you are
> > > > > > throttled",
> > > > > > > > > which
> > > > > > > > > > is
> > > > > > > > > > > > all
> > > > > > > > > > > > > > > that throttling does; they need to take a
> > follow-up
> > > > > step
> > > > > > > and
> > > > > > > > > see
> > > > > > > > > > > > "hmm,
> > > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > > throttled probably because of ..", which is by
> > > > looking
> > > > > at
> > > > > > > > other
> > > > > > > > > > > > metric
> > > > > > > > > > > > > > > values: e.g. whether I'm bombarding the brokers
> > > with
> > > > > > > metadata
> > > > > > > > > > > > request,
> > > > > > > > > > > > > > > which are usually cheap to handle but I'm
> sending
> > > > > > thousands
> > > > > > > > per
> > > > > > > > > > > > second;
> > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > is it because I'm catching up and hence sending
> > > very
> > > > > > heavy
> > > > > > > > > > fetching
> > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > with large min.bytes, etc.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regarding the implementation, as once
> > discussed
> > > > with
> > > > > > > Jun,
> > > > > > > > > this
> > > > > > > > > > > > seems
> > > > > > > > > > > > > > not
> > > > > > > > > > > > > > > very difficult since today we are already
> > > collecting
> > > > > the
> > > > > > > > > "thread
> > > > > > > > > > > pool
> > > > > > > > > > > > > > > utilization" metrics, which is a single
> > percentage
> > > > > > > > > > > > "aggregateIdleMeter"
> > > > > > > > > > > > > > > value; but we are already effectively
> aggregating
> > > it
> > > > > for
> > > > > > > each
> > > > > > > > > > > > requests
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > KafkaRequestHandler, and we can just extend it
> by
> > > > > > recording
> > > > > > > > the
> > > > > > > > > > > > source
> > > > > > > > > > > > > > > client id when handling them and aggregating by
> > > > > clientId
> > > > > > as
> > > > > > > > > well
> > > > > > > > > > as
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > total aggregate.
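
A sketch of that per-clientId accounting (illustrative Java; the class and
method names here are hypothetical, not the actual KafkaRequestHandler code):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    // Each request handler records the nanoseconds spent on a request under
    // the originating clientId as well as into the overall total, so a
    // per-client share of handler time can be derived.
    class HandlerTimeAccountant {
        private final LongAdder totalNanos = new LongAdder();
        private final Map<String, LongAdder> nanosByClientId = new ConcurrentHashMap<>();

        void record(String clientId, long elapsedNanos) {
            totalNanos.add(elapsedNanos);
            nanosByClientId.computeIfAbsent(clientId, id -> new LongAdder()).add(elapsedNanos);
        }

        // Fraction of all recorded handler time attributable to one client.
        double shareOf(String clientId) {
            long total = totalNanos.sum();
            LongAdder client = nanosByClientId.get(clientId);
            return (total == 0 || client == null) ? 0.0 : (double) client.sum() / total;
        }
    }
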
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Guozhang
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <
> > > > > > > jay@confluent.io
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Hey Becket/Rajini,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > When I thought about it more deeply I came
> > around
> > > > to
> > > > > > the
> > > > > > > > > > "percent
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > > processing time" metric too. It seems a lot
> > > closer
> > > > to
> > > > > > the
> > > > > > > > > thing
> > > > > > > > > > > we
> > > > > > > > > > > > > > > actually
> > > > > > > > > > > > > > > > care about and need to protect. I also think
> > this
> > > > > would
> > > > > > > be
> > > > > > > > a
> > > > > > > > > > very
> > > > > > > > > > > > > > useful
> > > > > > > > > > > > > > > > metric even in the absence of throttling just
> > to
> > > > > debug
> > > > > > > > who's
> > > > > > > > > > > using
> > > > > > > > > > > > > > > > capacity.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Two problems to consider:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >    1. I agree that for the user it is
> > > > understandable
> > > > > > what
> > > > > > > > > led
> > > > > > > > > > to
> > > > > > > > > > > > > their
> > > > > > > > > > > > > > > >    being throttled, but it is a bit hard to
> > > figure
> > > > > out
> > > > > > > the
> > > > > > > > > safe
> > > > > > > > > > > > range
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > >    them. i.e. if I have a new app that will
> > send
> > > > 200
> > > > > > > > > > > messages/sec I
> > > > > > > > > > > > > can
> > > > > > > > > > > > > > > >    probably reason that I'll be under the
> > > > throttling
> > > > > > > limit
> > > > > > > > of
> > > > > > > > > > 300
> > > > > > > > > > > > > > > req/sec.
> > > > > > > > > > > > > > > >    However if I need to be under a 10% CPU
> > > > resources
> > > > > > > limit
> > > > > > > > it
> > > > > > > > > > may
> > > > > > > > > > > > be
> > > > > > > > > > > > > a
> > > > > > > > > > > > > > > bit
> > > > > > > > > > > > > > > >    harder for me to know a priori if I will
> or
> > > > won't.
> > > > > > > > > > > > > > > >    2. Calculating the available CPU time is a
> > bit
> > > > > > > difficult
> > > > > > > > > > since
> > > > > > > > > > > > > there
> > > > > > > > > > > > > > > are
> > > > > > > > > > > > > > > >    actually two thread pools--the I/O threads
> > and
> > > > the
> > > > > > > > network
> > > > > > > > > > > > > threads.
> > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > > >    it might be workable to count just the I/O
> > > > thread
> > > > > > time
> > > > > > > > as
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > > > > > > proposal,
> > > > > > > > > > > > > > > >    but the network thread work is actually
> > > > > non-trivial
> > > > > > > > (e.g.
> > > > > > > > > > all
> > > > > > > > > > > > the
> > > > > > > > > > > > > > disk
> > > > > > > > > > > > > > > >    reads for fetches happen in that thread).
> If
> > > you
> > > > > > count
> > > > > > > > > both
> > > > > > > > > > > the
> > > > > > > > > > > > > > > network
> > > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > > >    I/O threads it can skew things a bit. E.g.
> > say
> > > > you
> > > > > > > have
> > > > > > > > 50
> > > > > > > > > > > > network
> > > > > > > > > > > > > > > > threads,
> > > > > > > > > > > > > > > >    10 I/O threads, and 8 cores, what is the
> > > > available
> > > > > > CPU
> > > > > > > > > time
> > > > > > > > > > > > > > > > in a
> > > > > > > > > > > > > > > >    second? I suppose this is a problem
> whenever
> > > you
> > > > > > have
> > > > > > > a
> > > > > > > > > > > > bottleneck
> > > > > > > > > > > > > > > > between
> > > > > > > > > > > > > > > >    I/O and network threads or if you end up
> > > > > > significantly
> > > > > > > > > > > > > > > over-provisioning
> > > > > > > > > > > > > > > >    one pool (both of which are hard to
> avoid).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > An alternative for CPU throttling would be to
> > use
> > > > > this
> > > > > > > api:
> > > > > > > > > > > > > > > > http://docs.oracle.com/javase/
> > > > > > 1.5.0/docs/api/java/lang/
> > > > > > > > > > > > > > > > management/ThreadMXBean.html#
> > > > getThreadCpuTime(long)
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > That would let you track actual CPU usage
> > across
> > > > the
> > > > > > > > network,
> > > > > > > > > > I/O
> > > > > > > > > > > > > > > threads,
> > > > > > > > > > > > > > > > and purgatory threads and look at it as a
> > > > percentage
> > > > > of
> > > > > > > > total
> > > > > > > > > > > > cores.
> > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > think this fixes many problems in the
> > reliability
> > > > of
> > > > > > the
> > > > > > > > > > metric.
> > > > > > > > > > > > Its
> > > > > > > > > > > > > > > > meaning is slightly different as it is just
> CPU
> > > > (you
> > > > > > > don't
> > > > > > > > > get
> > > > > > > > > > > > > charged
> > > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > > time blocking on I/O) but that may be okay
> > > because
> > > > we
> > > > > > > > already
> > > > > > > > > > > have
> > > > > > > > > > > > a
> > > > > > > > > > > > > > > > throttle on I/O. The downside is I think it
> is
> > > > > possible
> > > > > > > > this
> > > > > > > > > > api
> > > > > > > > > > > > can
> > > > > > > > > > > > > be
> > > > > > > > > > > > > > > > disabled or isn't always available and it may
> > > also
> > > > be
> > > > > > > > > expensive
> > > > > > > > > > > > (also
> > > > > > > > > > > > > > > I've
> > > > > > > > > > > > > > > > never used it so not sure if it really works
> > the
> > > > way
> > > > > I
> > > > > > > > > think).
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > -Jay
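
For reference, a minimal probe of the API mentioned above (these are standard
java.lang.management calls; whether the per-request cost is acceptable is
exactly the open question):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class ThreadCpuProbe {
        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            // Support is platform-dependent, and measurement can be disabled
            // even where it is supported, so check both before relying on it.
            if (!bean.isThreadCpuTimeSupported()) {
                System.out.println("Thread CPU time not supported on this JVM/platform");
                return;
            }
            if (!bean.isThreadCpuTimeEnabled())
                bean.setThreadCpuTimeEnabled(true);

            long before = bean.getCurrentThreadCpuTime(); // CPU nanoseconds, not wall clock
            long sink = 0;
            for (int i = 0; i < 10_000_000; i++) sink += i;
            long after = bean.getCurrentThreadCpuTime();
            System.out.println("CPU time used: " + (after - before) + " ns (sink=" + sink + ")");
            // getThreadCpuTime(threadId) is the cross-thread variant; it
            // returns -1 if measurement is disabled or the thread has died.
        }
    }
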
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <
> > > > > > > > > > > becket.qin@gmail.com>
> > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If the purpose of the KIP is only to
> protect
> > > the
> > > > > > > cluster
> > > > > > > > > from
> > > > > > > > > > > > being
> > > > > > > > > > > > > > > > > overwhelmed by crazy clients and is not
> > > intended
> > > > to
> > > > > > > > address
> > > > > > > > > > > > > the resource
> > > > > > > > > > > > > > > > > allocation problem among the clients, I am
> > > > > wondering
> > > > > > if
> > > > > > > > > using
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > handling time quota (CPU time quota) is a
> > > better
> > > > > > > option.
> > > > > > > > > Here
> > > > > > > > > > > are
> > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > reasons:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 1. The request handling time quota has better
> > > > > protection.
> > > > > > > Say
> > > > > > > > > we
> > > > > > > > > > > have
> > > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > rate quota and set that to some value like
> > 100
> > > > > > > > > requests/sec,
> > > > > > > > > > it
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > possible
> > > > > > > > > > > > > > > > > that some of the requests are very
> expensive and
> > > > > actually
> > > > > > > > take
> > > > > > > > > a
> > > > > > > > > > > lot
> > > > > > > > > > > > of
> > > > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > handle. In that case a few clients may
> still
> > > > > occupy a
> > > > > > > lot
> > > > > > > > > of
> > > > > > > > > > > CPU
> > > > > > > > > > > > > time
> > > > > > > > > > > > > > > > even if
> > > > > > > > > > > > > > > > > the request rate is low. Arguably we can
> > > > carefully
> > > > > > set
> > > > > > > > > > request
> > > > > > > > > > > > rate
> > > > > > > > > > > > > > > quota
> > > > > > > > > > > > > > > > > for each request and client id combination,
> > but
> > > > it
> > > > > > > could
> > > > > > > > > > still
> > > > > > > > > > > be
> > > > > > > > > > > > > > > tricky
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > get it right for everyone.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > If we use the request handling time quota,
> we
> > > can
> > > > > > > simply
> > > > > > > > > say
> > > > > > > > > > no
> > > > > > > > > > > > > > client
> > > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > > take up more than 30% of the total
> request
> > > > > > handling
> > > > > > > > > > capacity
> > > > > > > > > > > > > > > (measured
> > > > > > > > > > > > > > > > > by time), regardless of the difference
> among
> > > > > > different
> > > > > > > > > > requests
> > > > > > > > > > > > or
> > > > > > > > > > > > > > what
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > the client doing. In this case maybe we can
> > > quota
> > > > > all
> > > > > > > the
> > > > > > > > > > > > requests
> > > > > > > > > > > > > if
> > > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > > want to.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > 2. The main benefit of using a request rate
> > limit
> > > > is
> > > > > > that
> > > > > > > > it
> > > > > > > > > > > seems
> > > > > > > > > > > > > more
> > > > > > > > > > > > > > > > > intuitive. It is true that it is probably
> > > easier
> > > > to
> > > > > > > > explain
> > > > > > > > > > to
> > > > > > > > > > > > the
> > > > > > > > > > > > > > user
> > > > > > > > > > > > > > > > > what does that mean. However, in practice
> it
> > > > looks like
> > > > > > the
> > > > > > > > > impact
> > > > > > > > > > > of
> > > > > > > > > > > > > > > the request
> > > > > > > > > > > > > > > > > rate quota is not more quantifiable than
> the
> > > > > request
> > > > > > > > > handling
> > > > > > > > > > > > time
> > > > > > > > > > > > > > > quota.
> > > > > > > > > > > > > > > > > Unlike the byte rate quota, it is still
> > > difficult
> > > > > to
> > > > > > > > give a
> > > > > > > > > > > > number
> > > > > > > > > > > > > > > about
> > > > > > > > > > > > > > > > > the impact on throughput or latency when a
> > request
> > > > rate
> > > > > > > quota
> > > > > > > > > is
> > > > > > > > > > > hit.
> > > > > > > > > > > > > So
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > not better than the request handling time
> > > quota.
> > > > In
> > > > > > > fact
> > > > > > > > I
> > > > > > > > > > feel
> > > > > > > > > > > > it
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > clearer to tell the user that "you are limited
> > > > because
> > > > > > you
> > > > > > > > have
> > > > > > > > > > > taken
> > > > > > > > > > > > > 30%
> > > > > > > > > > > > > > > of
> > > > > > > > > > > > > > > > > the CPU time on the broker" than otherwise
> > > > > something
> > > > > > > like
> > > > > > > > > > "your
> > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > > rate quota on metadata requests has been
> reached".
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps
> <
> > > > > > > > > jay@confluent.io
> > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I think this proposal makes a lot of
> sense
> > > > > > > (especially
> > > > > > > > > now
> > > > > > > > > > > that
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > > oriented around request rate) and fills
> the
> > > > > biggest
> > > > > > > > > > remaining
> > > > > > > > > > > > gap
> > > > > > > > > > > > > > in
> > > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > > multi-tenancy story.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I think for intra-cluster communication
> > > > > > (StopReplica,
> > > > > > > > > etc)
> > > > > > > > > > we
> > > > > > > > > > > > > could
> > > > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > > > > throttling entirely. You can secure or
> > > > otherwise
> > > > > > > > > lock-down
> > > > > > > > > > > the
> > > > > > > > > > > > > > > cluster
> > > > > > > > > > > > > > > > > > communication to avoid any unauthorized
> > > > external
> > > > > > > party
> > > > > > > > > from
> > > > > > > > > > > > > trying
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > initiate these requests. As a result we
> are
> > > as
> > > > > > likely
> > > > > > > > to
> > > > > > > > > > > cause
> > > > > > > > > > > > > > > problems
> > > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > solve them by throttling these, right?
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I'm not so sure that we should exempt the
> > > > > consumer
> > > > > > > > > requests
> > > > > > > > > > > > such
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > > heartbeat. It's true that if we throttle
> an
> > > > app's
> > > > > > > > > heartbeat
> > > > > > > > > > > > > > requests
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > > cause it to fall out of its consumer
> group.
> > > > > However
> > > > > > > if
> > > > > > > > we
> > > > > > > > > > > don't
> > > > > > > > > > > > > > > > throttle
> > > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > > it may DDOS the cluster if the heartbeat
> > > > interval
> > > > > > is
> > > > > > > > set
> > > > > > > > > > > > > > incorrectly
> > > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > > some client in some language has a bug. I
> > > think
> > > > > the
> > > > > > > > > policy
> > > > > > > > > > > with
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > kind
> > > > > > > > > > > > > > > > > > of throttling is to protect the cluster
> > above
> > > > any
> > > > > > > > > > individual
> > > > > > > > > > > > app,
> > > > > > > > > > > > > > > > right?
> > > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > > think in general this should be okay
> since
> > > for
> > > > > most
> > > > > > > > > > > deployments
> > > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > > setting is meant as more of a safety
> > > > valve---that
> > > > > > is
> > > > > > > > > rather
> > > > > > > > > > > > than
> > > > > > > > > > > > > > set
> > > > > > > > > > > > > > > > > > something very close to what you expect
> to
> > > need
> > > > > > (say
> > > > > > > 2
> > > > > > > > > > > req/sec
> > > > > > > > > > > > or
> > > > > > > > > > > > > > > > > whatever)
> > > > > > > > > > > > > > > > > > you would have something quite high (like
> > 100
> > > > > > > req/sec)
> > > > > > > > > with
> > > > > > > > > > > > this
> > > > > > > > > > > > > > > meant
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > prevent a client gone crazy. I think when
> > > used
> > > > > this
> > > > > > > way
> > > > > > > > > > > > allowing
> > > > > > > > > > > > > > > those
> > > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > > be throttled would actually provide
> > > meaningful
> > > > > > > > > protection.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > -Jay

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Ismael,

For #3, typically, an admin won't configure more io threads than CPU cores,
but it's possible for an admin to start with fewer io threads than cores
and grow that later on.

Hi, Dong,

I think the throttleTime sensor on the broker tells the admin whether a
user/clientId is throttled or not.

Hi, Radai,

The reasoning for delaying the throttled requests on the broker instead of
returning an error immediately is that the latter has no way to prevent the
client from retrying immediately, which will make things worse. The
delaying logic is based on a delay queue. A separate expiration thread
just waits for the next request to expire. So, it doesn't tie up a
request handler thread.
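
For illustration, a minimal sketch of that pattern with Java's DelayQueue (the
class names are made up; the broker's actual purgatory machinery is more
involved):

    import java.util.concurrent.DelayQueue;
    import java.util.concurrent.Delayed;
    import java.util.concurrent.TimeUnit;

    // A response held back until its throttle delay elapses.
    class ThrottledResponse implements Delayed {
        private final long releaseTimeMs;
        final Runnable sendResponse;

        ThrottledResponse(long delayMs, Runnable sendResponse) {
            this.releaseTimeMs = System.currentTimeMillis() + delayMs;
            this.sendResponse = sendResponse;
        }

        public long getDelay(TimeUnit unit) {
            return unit.convert(releaseTimeMs - System.currentTimeMillis(), TimeUnit.MILLISECONDS);
        }

        public int compareTo(Delayed other) {
            return Long.compare(getDelay(TimeUnit.MILLISECONDS), other.getDelay(TimeUnit.MILLISECONDS));
        }
    }

    // One expiration thread waits on the head of the queue; request handler
    // threads only enqueue the delayed response and move on.
    class ThrottleReaper extends Thread {
        private final DelayQueue<ThrottledResponse> queue = new DelayQueue<>();

        void throttle(long delayMs, Runnable sendResponse) {
            queue.add(new ThrottledResponse(delayMs, sendResponse));
        }

        public void run() {
            while (!Thread.currentThread().isInterrupted()) {
                try {
                    // Blocks until the head of the queue has expired.
                    queue.take().sendResponse.run();
                } catch (InterruptedException e) {
                    return;
                }
            }
        }
    }
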

Thanks,

Jun

On Thu, Feb 23, 2017 at 9:07 AM, Ismael Juma <is...@juma.me.uk> wrote:

> Hi Jay,
>
> Regarding 1, I definitely like the simplicity of keeping a single throttle
> time field in the response. The downside is that the client metrics will be
> more coarse-grained.
>
> Regarding 3, we have `leader.imbalance.per.broker.percentage` and
> `log.cleaner.min.cleanable.ratio`.
>
> Ismael
>
> On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io> wrote:
>
> > A few minor comments:
> >
> >    1. Isn't it the case that the throttling time response field should
> have
> >    the total time your request was throttled irrespective of the quotas
> > that
> >    caused it? Limiting it to the byte rate quota doesn't make sense, but I
> > also
> >    don't think we want to end up adding new fields in the response for
> > every
> >    single thing we quota, right?
> >    2. I don't think we should make this quota specifically about io
> >    threads. Once we introduce these quotas, people set them and expect
> them
> > to
> >    be enforced (and if they aren't it may cause an outage). As a result
> > they
> >    are a bit more sensitive than normal configs, I think. The current
> > thread
> >    pools seem like something of an implementation detail and not the
> level
> > the
> >    user-facing quotas should be involved with. I think it might be better
> > to
> >    make this a general request-time throttle with no mention in the
> naming
> >    about I/O threads and simply acknowledge the current limitation (which
> > we
> >    may someday fix) in the docs that this covers only the time after the
> >    thread is read off the network.
> >    3. As such I think the right interface to the user would be something
> >    like percent_request_time and be in {0,...100} or request_time_ratio
> > and be
> >    in {0.0,...,1.0} (I think "ratio" is the terminology we used if the
> > scale
> >    is between 0 and 1 in the other metrics, right?)
> >
> > -Jay
> >
> > On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <rajinisivaram@gmail.com
> >
> > wrote:
> >
> > > Guozhang/Dong,
> > >
> > > Thank you for the feedback.
> > >
> > > Guozhang : I have updated the section on co-existence of byte rate and
> > > request time quotas.
> > >
> > > Dong: I hadn't added much detail to the metrics and sensors since they
> > are
> > > going to be very similar to the existing metrics and sensors. To avoid
> > > confusion, I have now added more detail. All metrics are in the group
> > > "quotaType" and all sensors have names starting with "quotaType" (where
> > > quotaType is Produce/Fetch/LeaderReplication/
> > > FollowerReplication/*IOThread*).
> > > So there will be no reuse of existing metrics/sensors. The new ones for
> > > request processing time based throttling will be completely independent
> > of
> > > existing metrics/sensors, but will be consistent in format.
> > >
> > > The existing throttle_time_ms field in produce/fetch responses will not
> > be
> > > impacted by this KIP. That will continue to return byte-rate based
> > > throttling times. In addition, a new field request_throttle_time_ms
> will
> > be
> > > added to return request quota based throttling times. These will be
> > exposed
> > > as new metrics on the client-side.
> > >
> > > Since all metrics and sensors are different for each type of quota, I
> > > believe there are already sufficient metrics to monitor throttling on
> both
> > > client and broker side for each type of throttling.
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <li...@gmail.com> wrote:
> > >
> > > > Hey Rajini,
> > > >
> > > > I think it makes a lot of sense to use io_thread_units as the metric to
> > quota
> > > > a user's traffic here. LGTM overall. I have some questions regarding
> > > sensors.
> > > >
> > > > - Can you be more specific in the KIP what sensors will be added? For
> > > > example, it will be useful to specify the name and attributes of
> these
> > > new
> > > > sensors.
> > > >
> > > > - We currently have throttle-time and queue-size for byte-rate based
> > > quota.
> > > > Are you going to have separate throttle-time and queue-size for
> > requests
> > > > throttled by io_thread_unit-based quota, or will they share the same
> > > > sensor?
> > > >
> > > > - Does the throttle-time in the ProduceResponse and FetchResponse
> > > contains
> > > > time due to io_thread_unit-based quota?
> > > >
> > > > - Currently the Kafka server doesn't provide any log or metrics that
> > > tell
> > > > whether any given clientId (or user) is throttled. This is not too
> bad
> > > > because we can still check the client-side byte-rate metric to
> validate
> > > > whether a given client is throttled. But with this io_thread_unit,
> > there
> > > > will be no way to validate whether a given client is slow because it
> > has
> > > > exceeded its io_thread_unit limit. It is necessary for users to be
> able
> > to
> > > > know this information to figure out whether they have reached their
> > quota
> > > > limit. How about we add a log4j log on the server side to periodically
> > > print
> > > > the (client_id, byte-rate-throttle-time,
> io-thread-unit-throttle-time)
> > so
> > > > that the Kafka administrator can figure out which users have reached
> their
> > > > limit and act accordingly?
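
For illustration, the periodic report could be as simple as the following
sketch (the ThrottleStats holder and the logged attribute names are
hypothetical, not existing broker code):

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.Executors;
    import java.util.concurrent.ScheduledExecutorService;
    import java.util.concurrent.TimeUnit;
    import org.slf4j.Logger;
    import org.slf4j.LoggerFactory;

    // Periodically logs per-client throttle times so an administrator can
    // see which clients are hitting which quota.
    class ThrottleLogger {
        private static final Logger log = LoggerFactory.getLogger(ThrottleLogger.class);

        static class ThrottleStats {
            volatile double byteRateThrottleMs;
            volatile double ioThreadUnitThrottleMs;
        }

        private final Map<String, ThrottleStats> statsByClientId = new ConcurrentHashMap<>();
        private final ScheduledExecutorService scheduler =
                Executors.newSingleThreadScheduledExecutor();

        void start() {
            scheduler.scheduleAtFixedRate(() ->
                statsByClientId.forEach((clientId, stats) -> {
                    if (stats.byteRateThrottleMs > 0 || stats.ioThreadUnitThrottleMs > 0)
                        log.info("client_id={} byte-rate-throttle-time={}ms io-thread-unit-throttle-time={}ms",
                                clientId, stats.byteRateThrottleMs, stats.ioThreadUnitThrottleMs);
                }), 1, 1, TimeUnit.MINUTES);
        }
    }
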
> > > >
> > > > Thanks,
> > > > Dong
> > > >
> > > >
> > > >
> > > >
> > > >
> > > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wa...@gmail.com>
> > > wrote:
> > > >
> > > > > Made a pass over the doc, overall LGTM except a minor comment on
> the
> > > > > throttling implementation:
> > > > >
> > > > > Stated as "Request processing time throttling will be applied on
> top
> > if
> > > > > necessary." I thought that it meant the request processing time
> > > > throttling
> > > > > is applied first, but reading on I found it actually meant to
> > > apply
> > > > > produce / fetch byte rate throttling first.
> > > > >
> > > > > Also the last sentence "The remaining delay if any is applied to
> the
> > > > > response." is a bit confusing to me. Maybe rewording it a bit?
> > > > >
> > > > >
> > > > > Guozhang
> > > > >
> > > > >
> > > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Thanks for the updated KIP. The latest proposal looks good to me.
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> > > > rajinisivaram@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Jun/Roger,
> > > > > > >
> > > > > > > Thank you for the feedback.
> > > > > > >
> > > > > > > 1. I have updated the KIP to use absolute units instead of
> > > > percentage.
> > > > > > The
> > > > > > > property is called *io_thread_units* to align with the thread
> > count
> > > > > > > property *num.io.threads*. When we implement network thread
> > > > utilization
> > > > > > > quotas, we can add another property *network_thread_units*.
> > > > > > >
> > > > > > > 2. ControlledShutdown is already listed under the exempt
> > requests.
> > > > Jun,
> > > > > > did
> > > > > > > you mean a different request that needs to be added? The four
> > > > requests
> > > > > > > currently exempt in the KIP are StopReplica,
> ControlledShutdown,
> > > > > > > LeaderAndIsr and UpdateMetadata. These are controlled using
> > > > > ClusterAction
> > > > > > > ACL, so it is easy to exclude and only throttle if
> unauthorized.
> > I
> > > > > wasn't
> > > > > > > sure if there are other requests used only for inter-broker communication
> that
> > > > needed
> > > > > > to
> > > > > > > be excluded.
> > > > > > >
> > > > > > > 3. I was thinking the smallest change would be to replace all
> > > > > references
> > > > > > to
> > > > > > > *requestChannel.sendResponse()* with a local method
> > > > > > > *sendResponseMaybeThrottle()* that does the throttling, if any,
> > plus
> > > > sends the
> > > > > > > response. If we throttle first in *KafkaApis.handle()*, the
> time
> > > > spent
> > > > > > > within the method handling the request will not be recorded or
> > used
> > > > in
> > > > > > > throttling. We can look into this again when the PR is ready
> for
> > > > > review.
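
Sketched in Java for illustration (the broker code is Scala; the wrapper's
method name follows the email, everything else here is hypothetical):

    // Every call site that used requestChannel.sendResponse(response) would
    // go through this wrapper instead, so the time already spent handling
    // the request is recorded before the response goes out.
    class ApiResponseSender {
        interface Response { String clientId(); long handlerTimeNanos(); }
        interface Channel { void sendResponse(Response r); }
        interface QuotaManager {
            // Records the observed handler time; returns a delay in ms,
            // or 0 if the client is within its quota.
            long recordAndMaybeThrottleMs(String clientId, long handlerTimeNanos);
        }
        interface Throttler { void throttle(long delayMs, Runnable action); }

        private final Channel requestChannel;
        private final QuotaManager quotaManager;
        private final Throttler throttler; // e.g. a delay-queue reaper

        ApiResponseSender(Channel c, QuotaManager q, Throttler t) {
            this.requestChannel = c; this.quotaManager = q; this.throttler = t;
        }

        void sendResponseMaybeThrottle(Response response) {
            long delayMs = quotaManager.recordAndMaybeThrottleMs(
                    response.clientId(), response.handlerTimeNanos());
            if (delayMs > 0)
                throttler.throttle(delayMs, () -> requestChannel.sendResponse(response));
            else
                requestChannel.sendResponse(response);
        }
    }
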
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > > > roger.hoover@gmail.com>
> > > > > > > wrote:
> > > > > > >
> > > > > > > > Great to see this KIP and the excellent discussion.
> > > > > > > >
> > > > > > > > To me, Jun's suggestion makes sense.  If my application is
> > > > allocated
> > > > > 1
> > > > > > > > request handler unit, then it's as if I have a Kafka broker
> > with
> > > a
> > > > > > single
> > > > > > > > request handler thread dedicated to me.  That's the most I
> can
> > > use,
> > > > > at
> > > > > > > > least.  That allocation doesn't change even if an admin later
> > > > > increases
> > > > > > > the
> > > > > > > > size of the request thread pool on the broker.  It's similar
> to
> > > the
> > > > > CPU
> > > > > > > > abstraction that VMs and containers get from hypervisors or
> OS
> > > > > > > schedulers.
> > > > > > > > While different client access patterns can use wildly
> different
> > > > > amounts
> > > > > > > of
> > > > > > > > request thread resources per request, a given application
> will
> > > > > > generally
> > > > > > > > have a stable access pattern and can figure out empirically
> how
> > > > many
> > > > > > > > "request thread units" it needs to meet it's
> throughput/latency
> > > > > goals.
> > > > > > > >
> > > > > > > > Cheers,
> > > > > > > >
> > > > > > > > Roger
> > > > > > > >
> > > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <ju...@confluent.io>
> > > wrote:
> > > > > > > >
> > > > > > > > > Hi, Rajini,
> > > > > > > > >
> > > > > > > > > Thanks for the updated KIP. A few more comments.
> > > > > > > > >
> > > > > > > > > 1. A concern about request_time_percent is that it's not an
> > > absolute
> > > > > > > value.
> > > > > > > > > Let's say you give a user a 10% limit. If the admin doubles
> > the
> > > > > > number
> > > > > > > of
> > > > > > > > > request handler threads, that user now actually has twice
> the
> > > > > > absolute
> > > > > > > > > capacity. This may confuse people a bit. So, perhaps
> setting
> > > the
> > > > > > quota
> > > > > > > > > based on an absolute request thread unit is better.
> > > > > > > > >
> > > > > > > > > 2. ControlledShutdownRequest is also an inter-broker
> request
> > > and
> > > > > > needs
> > > > > > > to
> > > > > > > > > be excluded from throttling.
> > > > > > > > >
> > > > > > > > > 3. Implementation-wise, I am wondering if it's simpler to
> > apply
> > > > the
> > > > > > > > request
> > > > > > > > > time throttling first in KafkaApis.handle(). Otherwise, we
> > will
> > > > > need
> > > > > > to
> > > > > > > > add
> > > > > > > > > the throttling logic in each type of request.
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jun
> > > > > > > > >
> > > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> > > > > > > rajinisivaram@gmail.com
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Jun,
> > > > > > > > > >
> > > > > > > > > > Thank you for the review.
> > > > > > > > > >
> > > > > > > > > > I have reverted to the original KIP that throttles based
> on
> > > > > request
> > > > > > > > > handler
> > > > > > > > > > utilization. At the moment, it uses percentage, but I am
> > > happy
> > > > to
> > > > > > > > change
> > > > > > > > > to
> > > > > > > > > > a fraction (out of 1 instead of 100) if required. I have
> > > added
> > > > > the
> > > > > > > > > examples
> > > > > > > > > > from this discussion to the KIP. Also added a "Future
> Work"
> > > > > section
> > > > > > > to
> > > > > > > > > > address network thread utilization. The configuration is
> > > named
> > > > > > > > > > "request_time_percent" with the expectation that it can
> > also
> > > be
> > > > > > used
> > > > > > > as
> > > > > > > > > the
> > > > > > > > > > limit for network thread utilization when that is
> > > implemented,
> > > > so
> > > > > > > that
> > > > > > > > > > users have to set only one config for the two and not
> have
> > to
> > > > > worry
> > > > > > > > about
> > > > > > > > > > the internal distribution of the work between the two
> > thread
> > > > > pools
> > > > > > in
> > > > > > > > > > Kafka.
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > >
> > > > > > > > > > Rajini
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> > jun@confluent.io>
> > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi, Rajini,
> > > > > > > > > > >
> > > > > > > > > > > Thanks for the proposal.
> > > > > > > > > > >
> > > > > > > > > > > The benefit of using the request processing time over
> the
> > > > > request
> > > > > > > > rate
> > > > > > > > > is
> > > > > > > > > > > exactly what people have said. I will just expand that
> a
> > > bit.
> > > > > > > > Consider
> > > > > > > > > > the
> > > > > > > > > > > following case. The producer sends a produce request
> > with a
> > > > > 10MB
> > > > > > > > > message
> > > > > > > > > > > but compressed to 100KB with gzip. The decompression of
> > the
> > > > > > message
> > > > > > > > on
> > > > > > > > > > the
> > > > > > > > > > > broker could take 10-15 seconds, during which time, a
> > > request
> > > > > > > handler
> > > > > > > > > > > thread is completely blocked. In this case, neither the
> > > > byte-in
> > > > > > > quota
> > > > > > > > > nor
> > > > > > > > > > > the request rate quota may be effective in protecting
> the
> > > > > broker.
> > > > > > > > > > Consider
> > > > > > > > > > > another case. A consumer group starts with 10 instances
> > and
> > > > > later
> > > > > > > on
> > > > > > > > > > > switches to 20 instances. The request rate will likely
> > > > double,
> > > > > > but
> > > > > > > > the
> > > > > > > > > > > actual load on the broker may not double since each
> > fetch
> > > > > > request
> > > > > > > > > only
> > > > > > > > > > > contains half of the partitions. Request rate quota may
> > not
> > > > be
> > > > > > easy
> > > > > > > > to
> > > > > > > > > > > configure in this case.
> > > > > > > > > > >
> > > > > > > > > > > What we really want is to be able to prevent a client
> > from
> > > > > using
> > > > > > > too
> > > > > > > > > much
> > > > > > > > > > > of the server side resources. In this particular KIP,
> > this
> > > > > > resource
> > > > > > > > is
> > > > > > > > > > the
> > > > > > > > > > > capacity of the request handler threads. I agree that
> it
> > > may
> > > > > not
> > > > > > be
> > > > > > > > > > > intuitive for the users to determine how to set the
> right
> > > > > limit.
> > > > > > > > > However,
> > > > > > > > > > > this is not completely new and has been done in the
> > > container
> > > > > > world
> > > > > > > > > > > already. For example, Linux cgroup (
> > > > https://access.redhat.com/
> > > > > > > > > > > documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
> > > > > > > > > > > Resource_Management_Guide/sec-cpu.html) has the
> concept
> > of
> > > > > > > > > > > cpu.cfs_quota_us,
> > > > > > > > > > > which specifies the total amount of time in
> microseconds
> > > for
> > > > > > which
> > > > > > > > all
> > > > > > > > > > > tasks in a cgroup can run during a one second period.
> We
> > > can
> > > > > > > > > potentially
> > > > > > > > > > > model the request handler threads in a similar way. For
> > > > > example,
> > > > > > > each
> > > > > > > > > > > request handler thread can be 1 request handler unit
> and
> > > the
> > > > > > admin
> > > > > > > > can
> > > > > > > > > > > configure a limit on how many units (say 0.01) a client
> > can
> > > > > have.
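
To make the unit concrete: under that model a quota of 0.01 units is 1% of one
request handler thread, i.e. 10 ms of handler time per second, independent of
the pool size. A sketch of the accounting for a single client over a
one-second window (all names hypothetical):

    // 1 unit = one request handler thread fully busy, i.e. one second of
    // handler time per second of wall clock.
    class HandlerUnitQuota {
        private final double quotaUnits;               // e.g. 0.01 => 10 ms/sec
        private static final long WINDOW_NANOS = 1_000_000_000L;
        private long windowStartNanos;
        private long usedNanos;

        HandlerUnitQuota(double quotaUnits) { this.quotaUnits = quotaUnits; }

        // Records handler time for one request; returns how long (ms) to
        // delay the response if the client exceeded its allowance.
        synchronized long recordAndGetThrottleMs(long nowNanos, long handlerNanos) {
            if (nowNanos - windowStartNanos >= WINDOW_NANOS) {
                windowStartNanos = nowNanos;
                usedNanos = 0;
            }
            usedNanos += handlerNanos;
            double allowedNanos = quotaUnits * WINDOW_NANOS;
            if (usedNanos <= allowedNanos) return 0;
            // Wait long enough that the overage would have fit in the quota.
            return (long) ((usedNanos - allowedNanos) / quotaUnits) / 1_000_000L;
        }
    }
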
> > > > > > > > > > >
> > > > > > > > > > > Regarding not throttling the internal broker-to-broker
> > > > > requests.
> > > > > > We
> > > > > > > > > could
> > > > > > > > > > > do that. Alternatively, we could just let the admin
> > > > configure a
> > > > > > > high
> > > > > > > > > > limit
> > > > > > > > > > > for the kafka user (it may not be able to do that
> easily
> > > > based
> > > > > on
> > > > > > > > > > clientId
> > > > > > > > > > > though).
> > > > > > > > > > >
> > > > > > > > > > > Ideally we want to be able to protect the utilization
> of
> > > the
> > > > > > > network
> > > > > > > > > > thread
> > > > > > > > > > > pool too. The difficulty is mostly what Rajini said: (1)
> > The
> > > > > > > mechanism
> > > > > > > > > for
> > > > > > > > > > > throttling the requests is through Purgatory and we
> will
> > > have
> > > > > to
> > > > > > > > think
> > > > > > > > > > > through how to integrate that into the network layer.
> > (2)
> > > In
> > > > > the
> > > > > > > > > network
> > > > > > > > > > > layer, currently we know the user, but not the clientId
> > of
> > > > the
> > > > > > > > request.
> > > > > > > > > > So,
> > > > > > > > > > > it's a bit tricky to throttle based on clientId there.
> > > Plus,
> > > > > the
> > > > > > > > > byteOut
> > > > > > > > > > > quota can already protect the network thread
> utilization
> > > for
> > > > > > fetch
> > > > > > > > > > > requests. So, if we can't figure out this part right
> now,
> > > > just
> > > > > > > > focusing
> > > > > > > > > > on
> > > > > > > > > > > the request handling threads for this KIP is still a
> > useful
> > > > > > > feature.
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jun
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <
> > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Thank you all for the feedback.
> > > > > > > > > > > >
> > > > > > > > > > > > Jay: I have removed exemption for consumer heartbeat
> > etc.
> > > > > Agree
> > > > > > > > that
> > > > > > > > > > > > protecting the cluster is more important than
> > protecting
> > > > > > > individual
> > > > > > > > > > apps.
> > > > > > > > > > > > Have retained the exemption for
> > StopReplica/LeaderAndIsr
> > > > > etc,
> > > > > > > > these
> > > > > > > > > > are
> > > > > > > > > > > > throttled only if authorization fails (so can't be
> used
> > > for
> > > > > DoS
> > > > > > > > > attacks
> > > > > > > > > > > in
> > > > > > > > > > > > a secure cluster, but allows inter-broker requests to
> > > > > complete
> > > > > > > > > without
> > > > > > > > > > > > delays).
> > > > > > > > > > > >
> > > > > > > > > > > > I will wait another day to see if there is any
> > objection
> > > to
> > > > > > > quotas
> > > > > > > > > > based
> > > > > > > > > > > on
> > > > > > > > > > > > request processing time (as opposed to request rate)
> > and
> > > if
> > > > > > there
> > > > > > > > are
> > > > > > > > > > no
> > > > > > > > > > > > objections, I will revert to the original proposal
> with
> > > > some
> > > > > > > > changes.
> > > > > > > > > > > >
> > > > > > > > > > > > The original proposal included only the time
> used
> > by
> > > > the
> > > > > > > > request
> > > > > > > > > > > > handler threads (that made calculation easy). I think
> > the
> > > > > > > > suggestion
> > > > > > > > > is
> > > > > > > > > > > to
> > > > > > > > > > > > include the time spent in the network threads as well
> > > since
> > > > > > that
> > > > > > > > may
> > > > > > > > > be
> > > > > > > > > > > > significant. As Jay pointed out, it is more
> complicated
> > > to
> > > > > > > > calculate
> > > > > > > > > > the
> > > > > > > > > > > > total available CPU time and convert to a ratio when
> > > there
> > > > > are *m*
> > > > > > > I/O
> > > > > > > > > > > threads
> > > > > > > > > > > > and *n* network threads.
> ThreadMXBean#getThreadCPUTime(
> > )
> > > > may
> > > > > > > give
> > > > > > > > us
> > > > > > > > > > > what
> > > > > > > > > > > > we want, but it can be very expensive on some
> > platforms.
> > > As
> > > > > > > Becket
> > > > > > > > > and
> > > > > > > > > > > > Guozhang have pointed out, we do have several time
> > > > > measurements
> > > > > > > > > already
> > > > > > > > > > > for
> > > > > > > > > > > > generating metrics that we could use, though we might
> > > want
> > > > to
> > > > > > > > switch
> > > > > > > > > to
> > > > > > > > > > > > nanoTime() instead of currentTimeMillis() since some
> of
> > > the
> > > > > > > values
> > > > > > > > > for
> > > > > > > > > > > > small requests may be < 1ms. But rather than add up
> the
> > > > time
> > > > > > > spent
> > > > > > > > in
> > > > > > > > > > I/O
> > > > > > > > > > > > thread and network thread, wouldn't it be better to
> > > convert
> > > > > the
> > > > > > > > time
> > > > > > > > > > > spent
> > > > > > > > > > > > on each thread into a separate ratio? UserA has a
> > request
> > > > > quota
> > > > > > > of
> > > > > > > > > 5%.
> > > > > > > > > > > Can
> > > > > > > > > > > > we take that to mean that UserA can use 5% of the
> time
> > on
> > > > > > network
> > > > > > > > > > threads
> > > > > > > > > > > > and 5% of the time on I/O threads? If either is
> > exceeded,
> > > > the
> > > > > > > > > response
> > > > > > > > > > is
> > > > > > > > > > > > throttled - it would mean maintaining two sets of
> > metrics
> > > > for
> > > > > > the
> > > > > > > > two
> > > > > > > > > > > > durations, but would result in more meaningful
> ratios.
> > We
> > > > > could
> > > > > > > > > define
> > > > > > > > > > > two
> > > > > > > > > > > > quota limits (UserA has 5% of request threads and 10%
> > of
> > > > > > network
> > > > > > > > > > > threads),
> > > > > > > > > > > > but that seems unnecessary and harder to explain to
> > > users.
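
A rough sketch of that single-quota, two-pool check (hypothetical names; the
point is that the same 5% is evaluated independently against each pool's
capacity):

    // Capacity of a pool over the window is windowNanos * poolSize, so the
    // same ratio limit scales naturally with the number of threads.
    class DualPoolQuota {
        private final double quotaRatio;   // e.g. 0.05 for a 5% quota
        private final int networkThreads;
        private final int ioThreads;

        DualPoolQuota(double quotaRatio, int networkThreads, int ioThreads) {
            this.quotaRatio = quotaRatio;
            this.networkThreads = networkThreads;
            this.ioThreads = ioThreads;
        }

        boolean violated(long networkTimeNanos, long ioTimeNanos, long windowNanos) {
            double networkRatio = networkTimeNanos / (windowNanos * (double) networkThreads);
            double ioRatio = ioTimeNanos / (windowNanos * (double) ioThreads);
            return networkRatio > quotaRatio || ioRatio > quotaRatio;
        }
    }
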
> > > > > > > > > > > >
> > > > > > > > > > > > Back to why and how quotas are applied to network
> > thread
> > > > > > > > utilization:
> > > > > > > > > > > > a) In the case of fetch, the time spent in the
> network
> > > > > thread
> > > > > > > may
> > > > > > > > be
> > > > > > > > > > > > significant and I can see the need to include this.
> Are
> > > > there
> > > > > > > other
> > > > > > > > > > > > requests where the network thread utilization is
> > > > significant?
> > > > > > In
> > > > > > > > the
> > > > > > > > > > case
> > > > > > > > > > > > of fetch, request handler thread utilization would
> > > throttle
> > > > > > > clients
> > > > > > > > > > with
> > > > > > > > > > > > high request rate and low data volume, and the fetch byte
> rate
> > > > quota
> > > > > > will
> > > > > > > > > > > throttle
> > > > > > > > > > > > clients with high data volume. Network thread
> > utilization
> > > > is
> > > > > > > > perhaps
> > > > > > > > > > > > proportional to the data volume. I am wondering if we
> > > even
> > > > > need
> > > > > > > to
> > > > > > > > > > > throttle
> > > > > > > > > > > > based on network thread utilization or whether the
> data
> > > > > volume
> > > > > > > > quota
> > > > > > > > > > > covers
> > > > > > > > > > > > this case.
> > > > > > > > > > > >
> > > > > > > > > > > > b) At the moment, we record and check for quota
> > violation
> > > > at
> > > > > > the
> > > > > > > > same
> > > > > > > > > > > time.
> > > > > > > > > > > > If a quota is violated, the response is delayed.
> Using
> > > > Jay's
> > > > > > > > example
> > > > > > > > > of
> > > > > > > > > > > > disk reads for fetches happening in the network
> thread,
> > > We
> > > > > > can't
> > > > > > > > > record
> > > > > > > > > > > and
> > > > > > > > > > > > delay a response after the disk reads. We could
> record
> > > the
> > > > > time
> > > > > > > > spent
> > > > > > > > > > on
> > > > > > > > > > > > the network thread when the response is complete and
> > > > > introduce
> > > > > > a
> > > > > > > > > delay
> > > > > > > > > > > for
> > > > > > > > > > > > handling a subsequent request (separate out recording
> > and
> > > > > quota
> > > > > > > > > > violation
> > > > > > > > > > > > handling in the case of network thread overload).
> Does
> > > that
> > > > > > make
> > > > > > > > > sense?
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > Regards,
> > > > > > > > > > > >
> > > > > > > > > > > > Rajini
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
> > > > > > > becket.qin@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hey Jay,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Yeah, I agree that enforcing the CPU time is a
> little
> > > > > > tricky. I
> > > > > > > > am
> > > > > > > > > > > > thinking
> > > > > > > > > > > > > that maybe we can use the existing request
> > statistics.
> > > > They
> > > > > > are
> > > > > > > > > > already
> > > > > > > > > > > > > very detailed so we can probably see the
> approximate
> > > CPU
> > > > > time
> > > > > > > > from
> > > > > > > > > > it,
> > > > > > > > > > > > e.g.
> > > > > > > > > > > > > something like (total_time -
> > > request/response_queue_time
> > > > -
> > > > > > > > > > > remote_time).
> > > > > > > > > > > > >
> > > > > > > > > > > > > I agree with Guozhang that when a user is throttled
> > it
> > > is
> > > > > > > likely
> > > > > > > > > that
> > > > > > > > > > > we
> > > > > > > > > > > > > need to see if anything has gone wrong first, and
> if
> > > the
> > > > > > users
> > > > > > > > are
> > > > > > > > > > well
> > > > > > > > > > > > > behaving and just need more resources, we will have
> > to
> > > > bump
> > > > > > up
> > > > > > > > the
> > > > > > > > > > > quota
> > > > > > > > > > > > > for them. It is true that pre-allocating CPU time
> > quota
> > > > > > > precisely
> > > > > > > > > for
> > > > > > > > > > > the
> > > > > > > > > > > > > users is difficult. So in practice it would
> probably
> > be
> > > > > more
> > > > > > > like
> > > > > > > > > > first
> > > > > > > > > > > > set
> > > > > > > > > > > > > a relatively high protective CPU time quota for
> > everyone
> > > > and
> > > > > > > > increase
> > > > > > > > > > > that
> > > > > > > > > > > > > for some individual clients on demand.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <
> > > > > > > > wangguoz@gmail.com
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > This is a great proposal, glad to see it
> happening.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I am inclined to the CPU throttling, or more
> > > > specifically
> > > > > > > > > > processing
> > > > > > > > > > > > time
> > > > > > > > > > > > > > ratio instead of the request rate throttling as
> > well.
> > > > > > Becket
> > > > > > > > has
> > > > > > > > > > very
> > > > > > > > > > > > > well
> > > > > > > > > > > > > > summed up my rationales above, and one thing to add
> > here
> > > > is
> > > > > > that
> > > > > > > > the
> > > > > > > > > > > > former
> > > > > > > > > > > > > > has a good support for both "protecting against
> > rogue
> > > > > > > clients"
> > > > > > > > as
> > > > > > > > > > > well
> > > > > > > > > > > > as
> > > > > > > > > > > > > > "utilizing a cluster for multi-tenancy usage":
> when
> > > > > > thinking
> > > > > > > > > about
> > > > > > > > > > > how
> > > > > > > > > > > > to
> > > > > > > > > > > > > > explain this to the end users, I find it actually
> > > more
> > > > > > > natural
> > > > > > > > > than
> > > > > > > > > > > the
> > > > > > > > > > > > > > request rate since as mentioned above, different
> > > > requests
> > > > > > > will
> > > > > > > > > have
> > > > > > > > > > > > quite
> > > > > > > > > > > > > > different "cost", and Kafka today already have
> > > various
> > > > > > > request
> > > > > > > > > > types
> > > > > > > > > > > > > > (produce, fetch, admin, metadata, etc), because
> of
> > > that
> > > > > the
> > > > > > > > > request
> > > > > > > > > > > > rate
> > > > > > > > > > > > > > throttling may not be as effective unless it is
> set
> > > > very
> > > > > > > > > > > > conservatively.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regarding user reactions when they are
> > throttled,
> > > I
> > > > > > think
> > > > > > > it
> > > > > > > > > may
> > > > > > > > > > > > > differ
> > > > > > > > > > > > > > case-by-case, and need to be discovered / guided
> by
> > > > > looking
> > > > > > > at
> > > > > > > > > > > relative
> > > > > > > > > > > > > > metrics. So in other words users would not expect
> > to
> > > > get
> > > > > > > > > additional
> > > > > > > > > > > > > > information by simply being told "hey, you are
> > > > > throttled",
> > > > > > > > which
> > > > > > > > > is
> > > > > > > > > > > all
> > > > > > > > > > > > > > what throttling does; they need to take a
> follow-up
> > > > step
> > > > > > and
> > > > > > > > see
> > > > > > > > > > > "hmm,
> > > > > > > > > > > > > I'm
> > > > > > > > > > > > > > throttled probably because of ..", which is by
> > > looking
> > > > at
> > > > > > > other
> > > > > > > > > > > metric
> > > > > > > > > > > > > > values: e.g. whether I'm bombarding the brokers
> > with
> > > > > > metadata
> > > > > > > > > > > request,
> > > > > > > > > > > > > > which are usually cheap to handle but I'm sending
> > > > > thousands
> > > > > > > per
> > > > > > > > > > > second;
> > > > > > > > > > > > > or
> > > > > > > > > > > > > > is it because I'm catching up and hence sending
> > very
> > > > > heavy
> > > > > > > > > fetching
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > with large min.bytes, etc.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regarding the implementation, as once
> discussed
> > > with
> > > > > > Jun,
> > > > > > > > this
> > > > > > > > > > > seems
> > > > > > > > > > > > > not
> > > > > > > > > > > > > > very difficult since today we are already
> > collecting
> > > > the
> > > > > > > > "thread
> > > > > > > > > > pool
> > > > > > > > > > > > > > utilization" metrics, which is a single
> percentage
> > > > > > > > > > > "aggregateIdleMeter"
> > > > > > > > > > > > > > value; but we are already effectively aggregating
> > it
> > > > for
> > > > > > each
> > > > > > > > > > > request
> > > > > > > > > > > > in
> > > > > > > > > > > > > > KafkaRequestHandler, and we can just extend it by
> > > > > recording
> > > > > > > the
> > > > > > > > > > > source
> > > > > > > > > > > > > > client id when handling them and aggregating by
> > > > clientId
> > > > > as
> > > > > > > > well
> > > > > > > > > as
> > > > > > > > > > > the
> > > > > > > > > > > > > > total aggregate.
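> > > > > > > > > > > > > >
> > > > > > > > > > > > > > A sketch of what that per-clientId aggregation could look
> > > > > > > > > > > > > > like (structure and names invented here, not the actual
> > > > > > > > > > > > > > KafkaRequestHandler code):
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >   import java.util.Map;
> > > > > > > > > > > > > >   import java.util.concurrent.ConcurrentHashMap;
> > > > > > > > > > > > > >   import java.util.concurrent.atomic.LongAdder;
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >   class HandlerTimeAggregator {
> > > > > > > > > > > > > >     private final LongAdder totalNanos = new LongAdder();
> > > > > > > > > > > > > >     private final Map<String, LongAdder> perClientNanos =
> > > > > > > > > > > > > >         new ConcurrentHashMap<>();
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >     // Called by a request handler with the time spent on
> > > > > > > > > > > > > >     // one request, keyed by the client id that sent it.
> > > > > > > > > > > > > >     void record(String clientId, long nanos) {
> > > > > > > > > > > > > >       totalNanos.add(nanos);
> > > > > > > > > > > > > >       perClientNanos
> > > > > > > > > > > > > >           .computeIfAbsent(clientId, k -> new LongAdder())
> > > > > > > > > > > > > >           .add(nanos);
> > > > > > > > > > > > > >     }
> > > > > > > > > > > > > >   }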
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Guozhang
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <
> > > > > > jay@confluent.io
> > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hey Becket/Rajini,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > When I thought about it more deeply I came
> around
> > > to
> > > > > the
> > > > > > > > > "percent
> > > > > > > > > > > of
> > > > > > > > > > > > > > > processing time" metric too. It seems a lot
> > closer
> > > to
> > > > > the
> > > > > > > > thing
> > > > > > > > > > we
> > > > > > > > > > > > > > actually
> > > > > > > > > > > > > > > care about and need to protect. I also think
> this
> > > > would
> > > > > > be
> > > > > > > a
> > > > > > > > > very
> > > > > > > > > > > > > useful
> > > > > > > > > > > > > > > metric even in the absence of throttling just to
> > > > > > > > > > > > > > > debug who's using capacity.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Two problems to consider:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >    1. I agree that for the user it is
> > > > > > > > > > > > > > >    understandable what led to their
> > > > > > > > > > > > > > >    being throttled, but it is a bit hard to
> > figure
> > > > out
> > > > > > the
> > > > > > > > safe
> > > > > > > > > > > range
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > >    them. i.e. if I have a new app that will
> send
> > > 200
> > > > > > > > > > messages/sec I
> > > > > > > > > > > > can
> > > > > > > > > > > > > > >    probably reason that I'll be under the
> > > throttling
> > > > > > limit
> > > > > > > of
> > > > > > > > > 300
> > > > > > > > > > > > > > req/sec.
> > > > > > > > > > > > > > >    However if I need to be under a 10% CPU
> > > resources
> > > > > > limit
> > > > > > > it
> > > > > > > > > may
> > > > > > > > > > > be
> > > > > > > > > > > > a
> > > > > > > > > > > > > > bit
> > > > > > > > > > > > > > >    harder for me to know a priori if i will or
> > > won't.
> > > > > > > > > > > > > > >    2. Calculating the available CPU time is a
> bit
> > > > > > difficult
> > > > > > > > > since
> > > > > > > > > > > > there
> > > > > > > > > > > > > > are
> > > > > > > > > > > > > > >    actually two thread pools--the I/O threads
> and
> > > the
> > > > > > > network
> > > > > > > > > > > > threads.
> > > > > > > > > > > > > I
> > > > > > > > > > > > > > > think
> > > > > > > > > > > > > > >    it might be workable to count just the I/O
> > > thread
> > > > > time
> > > > > > > as
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > > > > proposal,
> > > > > > > > > > > > > > >    but the network thread work is actually
> > > > non-trivial
> > > > > > > (e.g.
> > > > > > > > > all
> > > > > > > > > > > the
> > > > > > > > > > > > > disk
> > > > > > > > > > > > > > >    reads for fetches happen in that thread). If
> > you
> > > > > count
> > > > > > > > both
> > > > > > > > > > the
> > > > > > > > > > > > > > network
> > > > > > > > > > > > > > > and
> > > > > > > > > > > > > > >    I/O threads it can skew things a bit. E.g.
> say
> > > you
> > > > > > have
> > > > > > > 50
> > > > > > > > > > > network
> > > > > > > > > > > > > > > threads,
> > > > > > > > > > > > > > >    10 I/O threads, and 8 cores, what is the
> > > > > > > > > > > > > > >    available CPU time in a
> > > > > > > > > > > > > > >    second? I suppose this is a problem whenever
> > you
> > > > > have
> > > > > > a
> > > > > > > > > > > bottleneck
> > > > > > > > > > > > > > > between
> > > > > > > > > > > > > > >    I/O and network threads or if you end up
> > > > > significantly
> > > > > > > > > > > > > > over-provisioning
> > > > > > > > > > > > > > >    one pool (both of which are hard to avoid).
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > An alternative for CPU throttling would be to
> use
> > > > this
> > > > > > api:
> > > > > > > > > > > > > > > http://docs.oracle.com/javase/
> > > > > 1.5.0/docs/api/java/lang/
> > > > > > > > > > > > > > > management/ThreadMXBean.html#
> > > getThreadCpuTime(long)
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > That would let you track actual CPU usage
> across
> > > the
> > > > > > > network,
> > > > > > > > > I/O
> > > > > > > > > > > > > > threads,
> > > > > > > > > > > > > > > and purgatory threads and look at it as a
> > > percentage
> > > > of
> > > > > > > total
> > > > > > > > > > > cores.
> > > > > > > > > > > > I
> > > > > > > > > > > > > > > think this fixes many problems in the
> reliability
> > > of
> > > > > the
> > > > > > > > > metric.
> > > > > > > > > > > > > > > Its meaning is slightly different as it is just CPU
> > > (you
> > > > > > don't
> > > > > > > > get
> > > > > > > > > > > > charged
> > > > > > > > > > > > > > for
> > > > > > > > > > > > > > > time blocking on I/O) but that may be okay
> > because
> > > we
> > > > > > > already
> > > > > > > > > > have
> > > > > > > > > > > a
> > > > > > > > > > > > > > > throttle on I/O. The downside is I think it is
> > > > possible
> > > > > > > this
> > > > > > > > > api
> > > > > > > > > > > can
> > > > > > > > > > > > be
> > > > > > > > > > > > > > > disabled or isn't always available and it may
> > also
> > > be
> > > > > > > > expensive
> > > > > > > > > > > (also
> > > > > > > > > > > > > > I've
> > > > > > > > > > > > > > > never used it so not sure if it really works
> the
> > > way
> > > > i
> > > > > > > > think).
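> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > For what it's worth, a rough sketch of how that API could
> > > > > > > > > > > > > > > be sampled (untested, and filtering threads by a name
> > > > > > > > > > > > > > > prefix is only an assumption about how the pools are
> > > > > > > > > > > > > > > labelled):
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >   import java.lang.management.ManagementFactory;
> > > > > > > > > > > > > > >   import java.lang.management.ThreadInfo;
> > > > > > > > > > > > > > >   import java.lang.management.ThreadMXBean;
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >   class CpuTimeSampler {
> > > > > > > > > > > > > > >     // Sum CPU time (ns) of live threads whose name
> > > > > > > > > > > > > > >     // contains the given prefix.
> > > > > > > > > > > > > > >     static long cpuTimeNanos(String namePrefix) {
> > > > > > > > > > > > > > >       ThreadMXBean bean = ManagementFactory.getThreadMXBean();
> > > > > > > > > > > > > > >       if (!bean.isThreadCpuTimeSupported()
> > > > > > > > > > > > > > >           || !bean.isThreadCpuTimeEnabled())
> > > > > > > > > > > > > > >         return -1; // API unavailable or disabled
> > > > > > > > > > > > > > >       long total = 0;
> > > > > > > > > > > > > > >       for (long id : bean.getAllThreadIds()) {
> > > > > > > > > > > > > > >         ThreadInfo info = bean.getThreadInfo(id);
> > > > > > > > > > > > > > >         if (info != null
> > > > > > > > > > > > > > >             && info.getThreadName().contains(namePrefix)) {
> > > > > > > > > > > > > > >           // -1 once the thread has died
> > > > > > > > > > > > > > >           long t = bean.getThreadCpuTime(id);
> > > > > > > > > > > > > > >           if (t > 0) total += t;
> > > > > > > > > > > > > > >         }
> > > > > > > > > > > > > > >       }
> > > > > > > > > > > > > > >       return total;
> > > > > > > > > > > > > > >     }
> > > > > > > > > > > > > > >   }
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Sampling the delta periodically and dividing by elapsed
> > > > > > > > > > > > > > > wall time times Runtime.getRuntime().availableProcessors()
> > > > > > > > > > > > > > > would give the percentage-of-total-cores view.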
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > -Jay
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <
> > > > > > > > > > becket.qin@gmail.com>
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If the purpose of the KIP is only to protect
> > the
> > > > > > cluster
> > > > > > > > from
> > > > > > > > > > > being
> > > > > > > > > > > > > > > > overwhelmed by crazy clients and is not
> > intended
> > > to
> > > > > > > address
> > > > > > > > > > > > resource
> > > > > > > > > > > > > > > > allocation problem among the clients, I am
> > > > wondering
> > > > > if
> > > > > > > > using
> > > > > > > > > > > > request
> > > > > > > > > > > > > > > > handling time quota (CPU time quota) is a
> > better
> > > > > > option.
> > > > > > > > Here
> > > > > > > > > > are
> > > > > > > > > > > > the
> > > > > > > > > > > > > > > > reasons:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 1. request handling time quota has better
> > > > protection.
> > > > > > Say
> > > > > > > > we
> > > > > > > > > > have
> > > > > > > > > > > > > > request
> > > > > > > > > > > > > > > > rate quota and set that to some value like
> 100
> > > > > > > > requests/sec,
> > > > > > > > > it
> > > > > > > > > > > is
> > > > > > > > > > > > > > > possible
> > > > > > > > > > > > > > > > that some of the requests are very expensive and
> > > > > > > > > > > > > > > > actually take a lot of time to
> > > > > > > > > > > > > > > > handle. In that case a few clients may still
> > > > > > > > > > > > > > > > occupy a lot of CPU time even if
> > > > > > > > > > > > > > > > the request rate is low. Arguably we can
> > > carefully
> > > > > set
> > > > > > > > > request
> > > > > > > > > > > rate
> > > > > > > > > > > > > > quota
> > > > > > > > > > > > > > > > for each request and client id combination,
> but
> > > it
> > > > > > could
> > > > > > > > > still
> > > > > > > > > > be
> > > > > > > > > > > > > > tricky
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > get it right for everyone.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > If we use the request handling time quota, we
> > can
> > > > > > simply
> > > > > > > > say
> > > > > > > > > no
> > > > > > > > > > > > > clients
> > > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > > take up more than 30% of the total request
> > > > > handling
> > > > > > > > > capacity
> > > > > > > > > > > > > > (measured
> > > > > > > > > > > > > > > > by time), regardless of the difference among
> > > > > different
> > > > > > > > > requests
> > > > > > > > > > > or
> > > > > > > > > > > > > what
> > > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > the client doing. In this case maybe we can
> > quota
> > > > all
> > > > > > the
> > > > > > > > > > > requests
> > > > > > > > > > > > if
> > > > > > > > > > > > > > we
> > > > > > > > > > > > > > > > want to.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > 2. The main benefit of using a request rate limit is
> > > > > > > > > > > > > > > > that it seems more intuitive. It is true that it is
> > > > > > > > > > > > > > > > probably easier to explain to the user what it means.
> > > > > > > > > > > > > > > > However, in practice the impact of a request rate
> > > > > > > > > > > > > > > > quota looks no more quantifiable than that of the
> > > > > > > > > > > > > > > > request handling time quota. Unlike the byte rate
> > > > > > > > > > > > > > > > quota, it is still difficult to give a number for the
> > > > > > > > > > > > > > > > impact on throughput or latency when a request rate
> > > > > > > > > > > > > > > > quota is hit. So it is not better than the request
> > > > > > > > > > > > > > > > handling time quota. In fact I feel it is clearer to
> > > > > > > > > > > > > > > > tell a user "you are limited because you have taken
> > > > > > > > > > > > > > > > 30% of the CPU time on the broker" than something like
> > > > > > > > > > > > > > > > "your request rate quota on metadata requests has been
> > > > > > > > > > > > > > > > reached".
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <
> > > > > > > > jay@confluent.io
> > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I think this proposal makes a lot of sense
> > > > > > (especially
> > > > > > > > now
> > > > > > > > > > that
> > > > > > > > > > > > it
> > > > > > > > > > > > > is
> > > > > > > > > > > > > > > > > oriented around request rate) and fills the
> > > > biggest
> > > > > > > > > remaining
> > > > > > > > > > > gap
> > > > > > > > > > > > > in
> > > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > > multi-tenancy story.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I think for intra-cluster communication
> > > > > (StopReplica,
> > > > > > > > etc)
> > > > > > > > > we
> > > > > > > > > > > > could
> > > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > > > throttling entirely. You can secure or
> > > otherwise
> > > > > > > > lock-down
> > > > > > > > > > the
> > > > > > > > > > > > > > cluster
> > > > > > > > > > > > > > > > > communication to avoid any unauthorized
> > > external
> > > > > > party
> > > > > > > > from
> > > > > > > > > > > > trying
> > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > initiate these requests. As a result we are
> > as
> > > > > likely
> > > > > > > to
> > > > > > > > > > cause
> > > > > > > > > > > > > > problems
> > > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > solve them by throttling these, right?
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I'm not so sure that we should exempt the
> > > > consumer
> > > > > > > > requests
> > > > > > > > > > > such
> > > > > > > > > > > > as
> > > > > > > > > > > > > > > > > heartbeat. It's true that if we throttle an
> > > app's
> > > > > > > > heartbeat
> > > > > > > > > > > > > requests
> > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > > cause it to fall out of its consumer group.
> > > > However
> > > > > > if
> > > > > > > we
> > > > > > > > > > don't
> > > > > > > > > > > > > > > throttle
> > > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > > it may DDOS the cluster if the heartbeat
> > > interval
> > > > > is
> > > > > > > set
> > > > > > > > > > > > > incorrectly
> > > > > > > > > > > > > > or
> > > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > > some client in some language has a bug. I
> > think
> > > > the
> > > > > > > > policy
> > > > > > > > > > with
> > > > > > > > > > > > > this
> > > > > > > > > > > > > > > kind
> > > > > > > > > > > > > > > > > of throttling is to protect the cluster
> above
> > > any
> > > > > > > > > individual
> > > > > > > > > > > app,
> > > > > > > > > > > > > > > right?
> > > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > > think in general this should be okay since
> > for
> > > > most
> > > > > > > > > > deployments
> > > > > > > > > > > > > this
> > > > > > > > > > > > > > > > > setting is meant as more of a safety
> > > valve---that
> > > > > is
> > > > > > > > rather
> > > > > > > > > > > than
> > > > > > > > > > > > > set
> > > > > > > > > > > > > > > > > something very close to what you expect to
> > need
> > > > > (say
> > > > > > 2
> > > > > > > > > > req/sec
> > > > > > > > > > > or
> > > > > > > > > > > > > > > > whatever)
> > > > > > > > > > > > > > > > > you would have something quite high (like
> 100
> > > > > > req/sec)
> > > > > > > > with
> > > > > > > > > > > this
> > > > > > > > > > > > > > meant
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > prevent a client gone crazy. I think when
> > used
> > > > this
> > > > > > way
> > > > > > > > > > > allowing
> > > > > > > > > > > > > > those
> > > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > > be throttled would actually provide
> > meaningful
> > > > > > > > protection.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > -Jay
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini
> > > Sivaram <
> > > > > > > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > I have just created KIP-124 to introduce
> > > > request
> > > > > > rate
> > > > > > > > > > quotas
> > > > > > > > > > > to
> > > > > > > > > > > > > > > Kafka:
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > https://cwiki.apache.org/
> > > > > > > confluence/display/KAFKA/KIP-
> > > > > > > > > > > > > > > > > > 124+-+Request+rate+quotas
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > The proposal is for a simple percentage
> > > request
> > > > > > > > handling
> > > > > > > > > > time
> > > > > > > > > > > > > quota
> > > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > > can be allocated to *<client-id>*,
> *<user>*
> > > or
> > > > > > > *<user,
> > > > > > > > > > > > > client-id>*.
> > > > > > > > > > > > > > > > There
> > > > > > > > > > > > > > > > > > are a few other suggestions also under
> > > > "Rejected
> > > > > > > > > > > alternatives".
> > > > > > > > > > > > > > > > Feedback
> > > > > > > > > > > > > > > > > > and suggestions are welcome.
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Thank you...
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > --
> > > > > > > > > > > > > > -- Guozhang
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > > >
> > > > >
> > > > > --
> > > > > -- Guozhang
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Ismael Juma <is...@juma.me.uk>.
Hi Jay,

Regarding 1, I definitely like the simplicity of keeping a single throttle
time field in the response. The downside is that the client metrics will be
more coarse grained.

Regarding 3, we have `leader.imbalance.per.broker.percentage` and
`log.cleaner.min.cleanable.ratio`.

Ismael

On Thu, Feb 23, 2017 at 4:43 PM, Jay Kreps <ja...@confluent.io> wrote:

> A few minor comments:
>
>    1. Isn't it the case that the throttling time response field should have
>    the total time your request was throttled, irrespective of the quotas
>    that caused it? Limiting it to the byte rate quota doesn't make sense,
>    but I also don't think we want to end up adding new fields in the
>    response for every single thing we quota, right?
>    2. I don't think we should make this quota specifically about io
>    threads. Once we introduce these quotas people set them and expect them
> to
>    be enforced (and if they aren't it may cause an outage). As a result
> they
>    are a bit more sensitive than normal configs, I think. The current
> thread
>    pools seem like something of an implementation detail and not the level
> the
>    user-facing quotas should be involved with. I think it might be better
> to
>    make this a general request-time throttle with no mention in the naming
>    about I/O threads and simply acknowledge the current limitation (which
> we
>    may someday fix) in the docs that this covers only the time after the
>    request is read off the network.
>    3. As such I think the right interface to the user would be something
>    like percent_request_time and be in {0,...100} or request_time_ratio
> and be
>    in {0.0,...,1.0} (I think "ratio" is the terminology we used if the
> scale
>    is between 0 and 1 in the other metrics, right?)
>
> -Jay
>
> On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Guozhang/Dong,
> >
> > Thank you for the feedback.
> >
> > Guozhang: I have updated the section on co-existence of byte rate and
> > request time quotas.
> >
> > Dong: I hadn't added much detail to the metrics and sensors since they
> are
> > going to be very similar to the existing metrics and sensors. To avoid
> > confusion, I have now added more detail. All metrics are in the group
> > "quotaType" and all sensors have names starting with "quotaType" (where
> > quotaType is Produce/Fetch/LeaderReplication/
> > FollowerReplication/*IOThread*).
> > So there will be no reuse of existing metrics/sensors. The new ones for
> > request processing time based throttling will be completely independent
> of
> > existing metrics/sensors, but will be consistent in format.
> >
> > The existing throttle_time_ms field in produce/fetch responses will not
> be
> > impacted by this KIP. That will continue to return byte-rate based
> > throttling times. In addition, a new field request_throttle_time_ms will
> be
> > added to return request quota based throttling times. These will be
> exposed
> > as new metrics on the client-side.
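> >
> > As an illustration, a client application could watch both values through
> > the usual metrics map (a sketch; matching on a "throttle-time" name
> > substring is a guess here, not the exact metric names):
> >
> >   import java.util.Map;
> >   import org.apache.kafka.clients.producer.KafkaProducer;
> >   import org.apache.kafka.common.Metric;
> >   import org.apache.kafka.common.MetricName;
> >
> >   class ThrottleMetrics {
> >     // Print every throttle-time metric the producer currently exposes.
> >     static void print(KafkaProducer<byte[], byte[]> producer) {
> >       for (Map.Entry<MetricName, ? extends Metric> e :
> >           producer.metrics().entrySet()) {
> >         if (e.getKey().name().contains("throttle-time"))
> >           System.out.println(e.getKey().name() + " = " + e.getValue().value());
> >       }
> >     }
> >   }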
> >
> > Since all metrics and sensors are different for each type of quota, I
> > believe there are already sufficient metrics to monitor throttling on both
> > client and broker side for each type of throttling.
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <li...@gmail.com> wrote:
> >
> > > Hey Rajini,
> > >
> > > I think it makes a lot of sense to use io_thread_units as the metric to
> > > quota a user's traffic here. LGTM overall. I have some questions
> > > regarding sensors.
> > >
> > > - Can you be more specific in the KIP about what sensors will be added?
> > > For example, it will be useful to specify the name and attributes of
> > > these new sensors.
> > >
> > > - We currently have throttle-time and queue-size for byte-rate based
> > quota.
> > > Are you going to have separate throttle-time and queue-size for
> requests
> > > throttled by io_thread_unit-based quota, or will they share the same
> > > sensor?
> > >
> > > - Does the throttle-time in the ProduceResponse and FetchResponse
> > > contain time due to the io_thread_unit-based quota?
> > >
> > > - Currently the kafka server doesn't provide any log or metrics that
> > > tell whether any given clientId (or user) is throttled. This is not too
> > > bad because we can still check the client-side byte-rate metric to
> > > validate whether a given client is throttled. But with this
> > > io_thread_unit, there will be no way to validate whether a given client
> > > is slow because it has exceeded its io_thread_unit limit. It is
> > > necessary for users to be able to know this information to figure out
> > > whether they have reached their quota limit. How about we add a log4j
> > > log on the server side to periodically print the (client_id,
> > > byte-rate-throttle-time, io-thread-unit-throttle-time) so that the kafka
> > > administrator can identify those users that have reached their limit and
> > > act accordingly?
> > >
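> > > Something like the following sketch would be enough (all names, the
> > > plumbing that feeds the map, and the period are made up for
> > > illustration):
> > >
> > >   import java.util.Map;
> > >   import java.util.concurrent.Executors;
> > >   import java.util.concurrent.ScheduledExecutorService;
> > >   import java.util.concurrent.TimeUnit;
> > >   import org.slf4j.Logger;
> > >   import org.slf4j.LoggerFactory;
> > >
> > >   class ThrottleLogger {
> > >     private static final Logger log =
> > >         LoggerFactory.getLogger(ThrottleLogger.class);
> > >     private final ScheduledExecutorService scheduler =
> > >         Executors.newSingleThreadScheduledExecutor();
> > >
> > >     // throttleMs: clientId -> [byte-rate throttle ms,
> > >     // io-thread-unit throttle ms], assumed to be kept up to date
> > >     // elsewhere by the quota managers.
> > >     void start(Map<String, double[]> throttleMs) {
> > >       scheduler.scheduleAtFixedRate(() -> {
> > >         for (Map.Entry<String, double[]> e : throttleMs.entrySet())
> > >           if (e.getValue()[0] > 0 || e.getValue()[1] > 0)
> > >             log.info("clientId={} byteRateThrottleMs={} "
> > >                 + "ioThreadUnitThrottleMs={}",
> > >                 e.getKey(), e.getValue()[0], e.getValue()[1]);
> > >       }, 1, 1, TimeUnit.MINUTES);
> > >     }
> > >   }
> > >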
> > > Thanks,
> > > Dong
> > >
> > >
> > >
> > >
> > >
> > > On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wa...@gmail.com>
> > wrote:
> > >
> > > > Made a pass over the doc, overall LGTM except a minor comment on the
> > > > throttling implementation:
> > > >
> > > > Stated as "Request processing time throttling will be applied on top
> if
> > > > necessary." I thought that it meant the request processing time
> > > throttling
> > > > is applied first, but continue reading I found it actually meant to
> > apply
> > > > produce / fetch byte rate throttling first.
> > > >
> > > > Also the last sentence "The remaining delay if any is applied to the
> > > > response." is a bit confusing to me. Maybe rewording it a bit?
> > > >
> > > >
> > > > Guozhang
> > > >
> > > >
> > > > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Rajini,
> > > > >
> > > > > Thanks for the updated KIP. The latest proposal looks good to me.
> > > > >
> > > > > Jun
> > > > >
> > > > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> > > rajinisivaram@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Jun/Roger,
> > > > > >
> > > > > > Thank you for the feedback.
> > > > > >
> > > > > > 1. I have updated the KIP to use absolute units instead of
> > > percentage.
> > > > > The
> > > > > > property is called* io_thread_units* to align with the thread
> count
> > > > > > property *num.io.threads*. When we implement network thread
> > > utilization
> > > > > > quotas, we can add another property *network_thread_units.*
> > > > > >
> > > > > > 2. ControlledShutdown is already listed under the exempt
> requests.
> > > Jun,
> > > > > did
> > > > > > you mean a different request that needs to be added? The four
> > > requests
> > > > > > currently exempt in the KIP are StopReplica, ControlledShutdown,
> > > > > > LeaderAndIsr and UpdateMetadata. These are controlled using
> > > > ClusterAction
> > > > > > ACL, so it is easy to exclude and only throttle if unauthorized. I
> > > > > > wasn't sure if there are other requests used only for inter-broker
> > > > > > communication that needed to be excluded.
> > > > > >
> > > > > > 3. I was thinking the smallest change would be to replace all
> > > > references
> > > > > to
> > > > > > *requestChannel.sendResponse()* with a local method
> > > > > > *sendResponseMaybeThrottle()* that does the throttling if any
> plus
> > > send
> > > > > > response. If we throttle first in *KafkaApis.handle()*, the time
> > > spent
> > > > > > within the method handling the request will not be recorded or
> used
> > > in
> > > > > > throttling. We can look into this again when the PR is ready for
> > > > review.
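> > > > > >
> > > > > > Roughly the shape I have in mind, as a sketch (every name below is
> > > > > > invented for illustration; the real types and signatures will
> > > > > > differ):
> > > > > >
> > > > > >   // Wrap the existing send: record the handler time against the
> > > > > >   // caller's quota, then either delay the response or send it.
> > > > > >   void sendResponseMaybeThrottle(Request request, Response response) {
> > > > > >     long handlerNanos = System.nanoTime() - request.dequeueTimeNanos();
> > > > > >     long throttleMs = quotaManager.recordAndMaybeThrottle(
> > > > > >         request.session(), request.clientId(), handlerNanos);
> > > > > >     if (throttleMs > 0)
> > > > > >       throttlePurgatory.delayResponse(response, throttleMs);
> > > > > >     else
> > > > > >       requestChannel.sendResponse(response);
> > > > > >   }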
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Rajini
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> > > roger.hoover@gmail.com>
> > > > > > wrote:
> > > > > >
> > > > > > > Great to see this KIP and the excellent discussion.
> > > > > > >
> > > > > > > To me, Jun's suggestion makes sense.  If my application is
> > > allocated
> > > > 1
> > > > > > > request handler unit, then it's as if I have a Kafka broker
> with
> > a
> > > > > single
> > > > > > > request handler thread dedicated to me.  That's the most I can
> > use,
> > > > at
> > > > > > > least.  That allocation doesn't change even if an admin later
> > > > increases
> > > > > > the
> > > > > > > size of the request thread pool on the broker.  It's similar to
> > the
> > > > CPU
> > > > > > > abstraction that VMs and containers get from hypervisors or OS
> > > > > > schedulers.
> > > > > > > While different client access patterns can use wildly different
> > > > amounts
> > > > > > of
> > > > > > > request thread resources per request, a given application will
> > > > > generally
> > > > > > > have a stable access pattern and can figure out empirically how
> > > many
> > > > > > > "request thread units" it needs to meet it's throughput/latency
> > > > goals.
> > > > > > >
> > > > > > > Cheers,
> > > > > > >
> > > > > > > Roger
> > > > > > >
> > > > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <ju...@confluent.io>
> > wrote:
> > > > > > >
> > > > > > > > Hi, Rajini,
> > > > > > > >
> > > > > > > > Thanks for the updated KIP. A few more comments.
> > > > > > > >
> > > > > > > > 1. A concern of request_time_percent is that it's not an
> > absolute
> > > > > > value.
> > > > > > > > Let's say you give a user a 10% limit. If the admin doubles
> the
> > > > > number
> > > > > > of
> > > > > > > > request handler threads, that user now actually has twice the
> > > > > absolute
> > > > > > > > capacity. This may confuse people a bit. So, perhaps setting
> > the
> > > > > quota
> > > > > > > > based on an absolute request thread unit is better.
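> > > > > > > >
> > > > > > > > (Concretely: a 10% quota on a broker with 8 request handler
> > > > > > > > threads allows 0.10 * 8 = 0.8 thread-seconds of handler time per
> > > > > > > > second; after the pool is doubled to 16 threads the same 10%
> > > > > > > > silently becomes 1.6 thread-seconds per second, whereas a quota
> > > > > > > > expressed as 0.8 absolute units would stay 0.8.)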
> > > > > > > >
> > > > > > > > 2. ControlledShutdownRequest is also an inter-broker request
> > and
> > > > > needs
> > > > > > to
> > > > > > > > be excluded from throttling.
> > > > > > > >
> > > > > > > > 3. Implementation wise, I am wondering if it's simpler to
> apply
> > > the
> > > > > > > request
> > > > > > > > time throttling first in KafkaApis.handle(). Otherwise, we
> will
> > > > need
> > > > > to
> > > > > > > add
> > > > > > > > the throttling logic in each type of request.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> > > > > > rajinisivaram@gmail.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Jun,
> > > > > > > > >
> > > > > > > > > Thank you for the review.
> > > > > > > > >
> > > > > > > > > I have reverted to the original KIP that throttles based on
> > > > request
> > > > > > > > handler
> > > > > > > > > utilization. At the moment, it uses percentage, but I am
> > happy
> > > to
> > > > > > > change
> > > > > > > > to
> > > > > > > > > a fraction (out of 1 instead of 100) if required. I have
> > added
> > > > the
> > > > > > > > examples
> > > > > > > > > from this discussion to the KIP. Also added a "Future Work"
> > > > section
> > > > > > to
> > > > > > > > > address network thread utilization. The configuration is
> > named
> > > > > > > > > "request_time_percent" with the expectation that it can
> also
> > be
> > > > > used
> > > > > > as
> > > > > > > > the
> > > > > > > > > limit for network thread utilization when that is
> > implemented,
> > > so
> > > > > > that
> > > > > > > > > users have to set only one config for the two and not have
> to
> > > > worry
> > > > > > > about
> > > > > > > > > the internal distribution of the work between the two
> thread
> > > > pools
> > > > > in
> > > > > > > > > Kafka.
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > Rajini
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <
> jun@confluent.io>
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi, Rajini,
> > > > > > > > > >
> > > > > > > > > > Thanks for the proposal.
> > > > > > > > > >
> > > > > > > > > > The benefit of using the request processing time over the
> > > > request
> > > > > > > rate
> > > > > > > > is
> > > > > > > > > > exactly what people have said. I will just expand that a
> > bit.
> > > > > > > Consider
> > > > > > > > > the
> > > > > > > > > > following case. The producer sends a produce request
> with a
> > > > 10MB
> > > > > > > > message
> > > > > > > > > > but compressed to 100KB with gzip. The decompression of
> the
> > > > > message
> > > > > > > on
> > > > > > > > > the
> > > > > > > > > > broker could take 10-15 seconds, during which time, a
> > request
> > > > > > handler
> > > > > > > > > > thread is completely blocked. In this case, neither the
> > > byte-in
> > > > > > quota
> > > > > > > > nor
> > > > > > > > > > the request rate quota may be effective in protecting the
> > > > broker.
> > > > > > > > > Consider
> > > > > > > > > > another case. A consumer group starts with 10 instances
> and
> > > > later
> > > > > > on
> > > > > > > > > > switches to 20 instances. The request rate will likely
> > > double,
> > > > > but
> > > > > > > the
> > > > > > > > > > actual load on the broker may not double since each
> fetch
> > > > > request
> > > > > > > > only
> > > > > > > > > > contains half of the partitions. Request rate quota may
> not
> > > be
> > > > > easy
> > > > > > > to
> > > > > > > > > > configure in this case.
> > > > > > > > > >
> > > > > > > > > > What we really want is to be able to prevent a client
> from
> > > > using
> > > > > > too
> > > > > > > > much
> > > > > > > > > > of the server side resources. In this particular KIP,
> this
> > > > > resource
> > > > > > > is
> > > > > > > > > the
> > > > > > > > > > capacity of the request handler threads. I agree that it
> > may
> > > > not
> > > > > be
> > > > > > > > > > intuitive for the users to determine how to set the right
> > > > limit.
> > > > > > > > However,
> > > > > > > > > > this is not completely new and has been done in the
> > container
> > > > > world
> > > > > > > > > > already. For example, Linux cgroup (
> > > https://access.redhat.com/
> > > > > > > > > > documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
> > > > > > > > > > Resource_Management_Guide/sec-cpu.html) has the concept
> of
> > > > > > > > > > cpu.cfs_quota_us,
> > > > > > > > > > which specifies the total amount of time in microseconds
> > for
> > > > > which
> > > > > > > all
> > > > > > > > > > tasks in a cgroup can run during a one second period. We
> > can
> > > > > > > > potentially
> > > > > > > > > > model the request handler threads in a similar way. For
> > > > example,
> > > > > > each
> > > > > > > > > > request handler thread can be 1 request handler unit and
> > the
> > > > > admin
> > > > > > > can
> > > > > > > > > > configure a limit on how many units (say 0.01) a client
> can
> > > > have.
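> > > > > > > > > >
> > > > > > > > > > To make the arithmetic concrete (under the same unit semantics
> > > > > > > > > > as the cgroup analogy above): a quota of 0.01 units entitles a
> > > > > > > > > > client to 0.01 * 1000 = 10 ms of handler-thread time per
> > > > > > > > > > second; on a broker with num.io.threads = 8 the total capacity
> > > > > > > > > > is 8 units, so that client can use at most about 0.125% of the
> > > > > > > > > > overall handler capacity, and its entitlement does not change
> > > > > > > > > > if the thread pool is later resized.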
> > > > > > > > > >
> > > > > > > > > > Regarding not throttling the internal broker to broker
> > > > requests.
> > > > > We
> > > > > > > > could
> > > > > > > > > > do that. Alternatively, we could just let the admin
> > > configure a
> > > > > > high
> > > > > > > > > limit
> > > > > > > > > > for the kafka user (it may not be able to do that easily
> > > based
> > > > on
> > > > > > > > > clientId
> > > > > > > > > > though).
> > > > > > > > > >
> > > > > > > > > > Ideally we want to be able to protect the utilization of
> > the
> > > > > > network
> > > > > > > > > thread
> > > > > > > > > > pool too. The difficulty is mostly what Rajini said: (1)
> > > > > > > > > > The mechanism for
> > > > > > > > > > throttling the requests is through Purgatory and we will
> > have
> > > > to
> > > > > > > think
> > > > > > > > > > through how to integrate that into the network layer.
> (2)
> > In
> > > > the
> > > > > > > > network
> > > > > > > > > > layer, currently we know the user, but not the clientId
> of
> > > the
> > > > > > > request.
> > > > > > > > > So,
> > > > > > > > > > it's a bit tricky to throttle based on clientId there.
> > Plus,
> > > > the
> > > > > > > > byteOut
> > > > > > > > > > quota can already protect the network thread utilization
> > for
> > > > > fetch
> > > > > > > > > > requests. So, if we can't figure out this part right now,
> > > just
> > > > > > > focusing
> > > > > > > > > on
> > > > > > > > > > the request handling threads for this KIP is still a
> useful
> > > > > > feature.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jun
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <
> > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > >
> > > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > Thank you all for the feedback.
> > > > > > > > > > >
> > > > > > > > > > > Jay: I have removed exemption for consumer heartbeat
> etc.
> > > > Agree
> > > > > > > that
> > > > > > > > > > > protecting the cluster is more important than
> protecting
> > > > > > individual
> > > > > > > > > apps.
> > > > > > > > > > > Have retained the exemption for
> > > > > > > > > > > StopReplica/LeaderAndIsr
> > > > etc,
> > > > > > > these
> > > > > > > > > are
> > > > > > > > > > > throttled only if authorization fails (so can't be used
> > for
> > > > DoS
> > > > > > > > attacks
> > > > > > > > > > in
> > > > > > > > > > > a secure cluster, but allows inter-broker requests to
> > > > complete
> > > > > > > > without
> > > > > > > > > > > delays).
> > > > > > > > > > >
> > > > > > > > > > > I will wait another day to see if there is any objection
> > > > > > > > > > > to quotas based on request processing time (as opposed to
> > > > > > > > > > > request rate) and if there are no objections, I will
> > > > > > > > > > > revert to the original proposal with some changes.
> > > > > > > > > > >
> > > > > > > > > > > The original proposal was only including the time used
> by
> > > the
> > > > > > > request
> > > > > > > > > > > handler threads (that made calculation easy). I think
> the
> > > > > > > suggestion
> > > > > > > > is
> > > > > > > > > > to
> > > > > > > > > > > include the time spent in the network threads as well
> > since
> > > > > that
> > > > > > > may
> > > > > > > > be
> > > > > > > > > > > significant. As Jay pointed out, it is more complicated
> > to
> > > > > > > calculate
> > > > > > > > > the
> > > > > > > > > > > total available CPU time and convert to a ratio when
> > > > > > > > > > > there are *m* I/O threads
> > > > > > > > > > > and *n* network threads. ThreadMXBean#getThreadCpuTime(
> )
> > > may
> > > > > > give
> > > > > > > us
> > > > > > > > > > what
> > > > > > > > > > > we want, but it can be very expensive on some
> platforms.
> > As
> > > > > > Becket
> > > > > > > > and
> > > > > > > > > > > Guozhang have pointed out, we do have several time
> > > > measurements
> > > > > > > > already
> > > > > > > > > > for
> > > > > > > > > > > generating metrics that we could use, though we might
> > want
> > > to
> > > > > > > switch
> > > > > > > > to
> > > > > > > > > > > nanoTime() instead of currentTimeMillis() since some of
> > the
> > > > > > values
> > > > > > > > for
> > > > > > > > > > > small requests may be < 1ms. But rather than add up the
> > > time
> > > > > > spent
> > > > > > > in
> > > > > > > > > I/O
> > > > > > > > > > > thread and network thread, wouldn't it be better to
> > convert
> > > > the
> > > > > > > time
> > > > > > > > > > spent
> > > > > > > > > > > on each thread into a separate ratio? UserA has a
> request
> > > > quota
> > > > > > of
> > > > > > > > 5%.
> > > > > > > > > > Can
> > > > > > > > > > > we take that to mean that UserA can use 5% of the time
> on
> > > > > network
> > > > > > > > > threads
> > > > > > > > > > > and 5% of the time on I/O threads? If either is
> exceeded,
> > > the
> > > > > > > > response
> > > > > > > > > is
> > > > > > > > > > > throttled - it would mean maintaining two sets of
> metrics
> > > for
> > > > > the
> > > > > > > two
> > > > > > > > > > > durations, but would result in more meaningful ratios.
> We
> > > > could
> > > > > > > > define
> > > > > > > > > > two
> > > > > > > > > > > quota limits (UserA has 5% of request threads and 10%
> of
> > > > > network
> > > > > > > > > > threads),
> > > > > > > > > > > but that seems unnecessary and harder to explain to
> > users.
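> > > > > > > > > > >
> > > > > > > > > > > As a toy sketch of the "throttle if either ratio is exceeded"
> > > > > > > > > > > check (illustration only, not proposed code):
> > > > > > > > > > >
> > > > > > > > > > >   class DualPoolQuota {
> > > > > > > > > > >     // One quota value applied independently to both pools.
> > > > > > > > > > >     static boolean shouldThrottle(double quotaRatio,
> > > > > > > > > > >                                   double ioThreadRatio,
> > > > > > > > > > >                                   double networkThreadRatio) {
> > > > > > > > > > >       return ioThreadRatio > quotaRatio
> > > > > > > > > > >           || networkThreadRatio > quotaRatio;
> > > > > > > > > > >     }
> > > > > > > > > > >   }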
> > > > > > > > > > >
> > > > > > > > > > > Back to why and how quotas are applied to network
> thread
> > > > > > > utilization:
> > > > > > > > > > > a) In the case of fetch,  the time spent in the network
> > > > thread
> > > > > > may
> > > > > > > be
> > > > > > > > > > > significant and I can see the need to include this. Are
> > > there
> > > > > > other
> > > > > > > > > > > requests where the network thread utilization is
> > > significant?
> > > > > In
> > > > > > > the
> > > > > > > > > case
> > > > > > > > > > > of fetch, request handler thread utilization would
> > throttle
> > > > > > clients
> > > > > > > > > with
> > > > > > > > > > > high request rate, low data volume and fetch byte rate
> > > quota
> > > > > will
> > > > > > > > > > throttle
> > > > > > > > > > > clients with high data volume. Network thread
> utilization
> > > is
> > > > > > > perhaps
> > > > > > > > > > > proportional to the data volume. I am wondering if we
> > even
> > > > need
> > > > > > to
> > > > > > > > > > throttle
> > > > > > > > > > > based on network thread utilization or whether the data
> > > > volume
> > > > > > > quota
> > > > > > > > > > covers
> > > > > > > > > > > this case.
> > > > > > > > > > >
> > > > > > > > > > > b) At the moment, we record and check for quota
> violation
> > > at
> > > > > the
> > > > > > > same
> > > > > > > > > > time.
> > > > > > > > > > > If a quota is violated, the response is delayed. Using
> > > > > > > > > > > Jay's example of disk reads for fetches happening in the
> > > > > > > > > > > network thread, we can't record
> > > > > > > > > > and
> > > > > > > > > > > delay a response after the disk reads. We could record
> > the
> > > > time
> > > > > > > spent
> > > > > > > > > on
> > > > > > > > > > > the network thread when the response is complete and
> > > > introduce
> > > > > a
> > > > > > > > delay
> > > > > > > > > > for
> > > > > > > > > > > handling a subsequent request (separate out recording
> and
> > > > quota
> > > > > > > > > violation
> > > > > > > > > > > handling in the case of network thread overload). Does
> > that
> > > > > make
> > > > > > > > sense?
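> > > > > > > > > > >
> > > > > > > > > > > A minimal sketch of that record-now, throttle-later idea (the
> > > > > > > > > > > names and the single fixed window are invented, and window
> > > > > > > > > > > rollover is elided):
> > > > > > > > > > >
> > > > > > > > > > >   import java.util.Map;
> > > > > > > > > > >   import java.util.concurrent.ConcurrentHashMap;
> > > > > > > > > > >
> > > > > > > > > > >   class DeferredThrottle {
> > > > > > > > > > >     private final Map<String, Long> usedNanos =
> > > > > > > > > > >         new ConcurrentHashMap<>();
> > > > > > > > > > >     private final long quotaNanosPerWindow;
> > > > > > > > > > >
> > > > > > > > > > >     DeferredThrottle(long quotaNanosPerWindow) {
> > > > > > > > > > >       this.quotaNanosPerWindow = quotaNanosPerWindow;
> > > > > > > > > > >     }
> > > > > > > > > > >
> > > > > > > > > > >     // Network thread: record usage once the response is sent.
> > > > > > > > > > >     void record(String clientId, long nanos) {
> > > > > > > > > > >       usedNanos.merge(clientId, nanos, Long::sum);
> > > > > > > > > > >     }
> > > > > > > > > > >
> > > > > > > > > > >     // Next request from this client: how long to delay it.
> > > > > > > > > > >     long throttleMs(String clientId) {
> > > > > > > > > > >       long over = usedNanos.getOrDefault(clientId, 0L)
> > > > > > > > > > >           - quotaNanosPerWindow;
> > > > > > > > > > >       return over > 0 ? over / 1_000_000 : 0;
> > > > > > > > > > >     }
> > > > > > > > > > >   }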
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > >
> > > > > > > > > > > Rajini
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
> > > > > > becket.qin@gmail.com>
> > > > > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey Jay,
> > > > > > > > > > > >
> > > > > > > > > > > > Yeah, I agree that enforcing the CPU time is a little
> > > > > tricky. I
> > > > > > > am
> > > > > > > > > > > thinking
> > > > > > > > > > > > that maybe we can use the existing request
> statistics.
> > > They
> > > > > are
> > > > > > > > > already
> > > > > > > > > > > > very detailed so we can probably see the approximate
> > CPU
> > > > time
> > > > > > > from
> > > > > > > > > it,
> > > > > > > > > > > e.g.
> > > > > > > > > > > > something like (total_time -
> > request/response_queue_time
> > > -
> > > > > > > > > > remote_time).
> > > > > > > > > > > >
> > > > > > > > > > > > I agree with Guozhang that when a user is throttled
> it
> > is
> > > > > > likely
> > > > > > > > that
> > > > > > > > > > we
> > > > > > > > > > > > need to see if anything has gone wrong first, and if
> > > > > > > > > > > > the users are well
> > > > > > > > > > > > behaving and just need more resources, we will have
> to
> > > bump
> > > > > up
> > > > > > > the
> > > > > > > > > > quota
> > > > > > > > > > > > for them. It is true that pre-allocating CPU time
> quota
> > > > > > precisely
> > > > > > > > for
> > > > > > > > > > the
> > > > > > > > > > > > users is difficult. So in practice it would probably
> be
> > > > more
> > > > > > like
> > > > > > > > > first
> > > > > > > > > > > set
> > > > > > > > > > > > a relative high protective CPU time quota for
> everyone
> > > and
> > > > > > > increase
> > > > > > > > > > that
> > > > > > > > > > > > for some individual clients on demand.
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <
> > > > > > > wangguoz@gmail.com
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > This is a great proposal, glad to see it happening.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I am inclined to the CPU throttling, or more
> > > specifically
> > > > > > > > > processing
> > > > > > > > > > > time
> > > > > > > > > > > > > ratio instead of the request rate throttling as
> well.
> > > > > Becket
> > > > > > > has
> > > > > > > > > very
> > > > > > > > > > > > well
> > > > > > > > > > > > > summed up my rationales above, and one thing to add
> here
> > > is
> > > > > that
> > > > > > > the
> > > > > > > > > > > former
> > > > > > > > > > > > > has a good support for both "protecting against
> rogue
> > > > > > clients"
> > > > > > > as
> > > > > > > > > > well
> > > > > > > > > > > as
> > > > > > > > > > > > > "utilizing a cluster for multi-tenancy usage": when
> > > > > thinking
> > > > > > > > about
> > > > > > > > > > how
> > > > > > > > > > > to
> > > > > > > > > > > > > explain this to the end users, I find it actually
> > more
> > > > > > natural
> > > > > > > > than
> > > > > > > > > > the
> > > > > > > > > > > > > request rate since as mentioned above, different
> > > requests
> > > > > > will
> > > > > > > > have
> > > > > > > > > > > quite
> > > > > > > > > > > > > different "cost", and Kafka today already have
> > various
> > > > > > request
> > > > > > > > > types
> > > > > > > > > > > > > (produce, fetch, admin, metadata, etc), because of
> > that
> > > > the
> > > > > > > > request
> > > > > > > > > > > rate
> > > > > > > > > > > > > throttling may not be as effective unless it is set
> > > very
> > > > > > > > > > > conservatively.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regarding user reactions when they are
> throttled,
> > I
> > > > > think
> > > > > > it
> > > > > > > > may
> > > > > > > > > > > > differ
> > > > > > > > > > > > > case-by-case, and need to be discovered / guided by
> > > > looking
> > > > > > at
> > > > > > > > > > relative
> > > > > > > > > > > > > metrics. So in other words users would not expect
> to
> > > get
> > > > > > > > additional
> > > > > > > > > > > > > information by simply being told "hey, you are
> > > > throttled",
> > > > > > > which
> > > > > > > > is
> > > > > > > > > > all
> > > > > > > > > > > > > what throttling does; they need to take a follow-up
> > > step
> > > > > and
> > > > > > > see
> > > > > > > > > > "hmm,
> > > > > > > > > > > > I'm
> > > > > > > > > > > > > throttled probably because of ..", which is by
> > looking
> > > at
> > > > > > other
> > > > > > > > > > metric
> > > > > > > > > > > > > values: e.g. whether I'm bombarding the brokers
> with
> > > > > metadata
> > > > > > > > > > request,
> > > > > > > > > > > > > which are usually cheap to handle but I'm sending
> > > > thousands
> > > > > > per
> > > > > > > > > > second;
> > > > > > > > > > > > or
> > > > > > > > > > > > > is it because I'm catching up and hence sending
> very
> > > > heavy
> > > > > > > > fetching
> > > > > > > > > > > > request
> > > > > > > > > > > > > with large min.bytes, etc.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regarding the implementation, as once discussed
> > with
> > > > > Jun,
> > > > > > > this
> > > > > > > > > > seems
> > > > > > > > > > > > not
> > > > > > > > > > > > > very difficult since today we are already
> collecting
> > > the
> > > > > > > "thread
> > > > > > > > > pool
> > > > > > > > > > > > > utilization" metrics, which is a single percentage
> > > > > > > > > > "aggregateIdleMeter"
> > > > > > > > > > > > > value; but we are already effectively aggregating
> it
> > > for
> > > > > each
> > > > > > > > > > request
> > > > > > > > > > > in
> > > > > > > > > > > > > KafkaRequestHandler, and we can just extend it by
> > > > recording
> > > > > > the
> > > > > > > > > > source
> > > > > > > > > > > > > client id when handling them and aggregating by
> > > clientId
> > > > as
> > > > > > > well
> > > > > > > > as
> > > > > > > > > > the
> > > > > > > > > > > > > total aggregate.
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > Guozhang
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <
> > > > > jay@confluent.io
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hey Becket/Rajini,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > When I thought about it more deeply I came around
> > to
> > > > the
> > > > > > > > "percent
> > > > > > > > > > of
> > > > > > > > > > > > > > processing time" metric too. It seems a lot
> closer
> > to
> > > > the
> > > > > > > thing
> > > > > > > > > we
> > > > > > > > > > > > > actually
> > > > > > > > > > > > > > care about and need to protect. I also think this
> > > would
> > > > > be
> > > > > > a
> > > > > > > > very
> > > > > > > > > > > > useful
> > > > > > > > > > > > > > metric even in the absence of throttling just to
> > > > > > > > > > > > > > debug who's using capacity.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Two problems to consider:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >    1. I agree that for the user it is
> > > > > > > > > > > > > >    understandable what led to their
> > > > > > > > > > > > > >    being throttled, but it is a bit hard to
> figure
> > > out
> > > > > the
> > > > > > > safe
> > > > > > > > > > range
> > > > > > > > > > > > for
> > > > > > > > > > > > > >    them. i.e. if I have a new app that will send
> > 200
> > > > > > > > > messages/sec I
> > > > > > > > > > > can
> > > > > > > > > > > > > >    probably reason that I'll be under the
> > throttling
> > > > > limit
> > > > > > of
> > > > > > > > 300
> > > > > > > > > > > > > req/sec.
> > > > > > > > > > > > > >    However if I need to be under a 10% CPU
> > resources
> > > > > limit
> > > > > > it
> > > > > > > > may
> > > > > > > > > > be
> > > > > > > > > > > a
> > > > > > > > > > > > > bit
> > > > > > > > > > > > > >    harder for me to know a priori if i will or
> > won't.
> > > > > > > > > > > > > >    2. Calculating the available CPU time is a bit
> > > > > difficult
> > > > > > > > since
> > > > > > > > > > > there
> > > > > > > > > > > > > are
> > > > > > > > > > > > > >    actually two thread pools--the I/O threads and
> > the
> > > > > > network
> > > > > > > > > > > threads.
> > > > > > > > > > > > I
> > > > > > > > > > > > > > think
> > > > > > > > > > > > > >    it might be workable to count just the I/O
> > thread
> > > > time
> > > > > > as
> > > > > > > in
> > > > > > > > > the
> > > > > > > > > > > > > > proposal,
> > > > > > > > > > > > > >    but the network thread work is actually
> > > non-trivial
> > > > > > (e.g.
> > > > > > > > all
> > > > > > > > > > the
> > > > > > > > > > > > disk
> > > > > > > > > > > > > >    reads for fetches happen in that thread). If
> you
> > > > count
> > > > > > > both
> > > > > > > > > the
> > > > > > > > > > > > > network
> > > > > > > > > > > > > > and
> > > > > > > > > > > > > >    I/O threads it can skew things a bit. E.g. say
> > you
> > > > > have
> > > > > > 50
> > > > > > > > > > network
> > > > > > > > > > > > > > threads,
> > > > > > > > > > > > > >    10 I/O threads, and 8 cores, what is the
> > > > > > > > > > > > > >    available CPU time in a
> > > > > > > > > > > > > >    second? I suppose this is a problem whenever
> you
> > > > have
> > > > > a
> > > > > > > > > > bottleneck
> > > > > > > > > > > > > > between
> > > > > > > > > > > > > >    I/O and network threads or if you end up
> > > > significantly
> > > > > > > > > > > > > over-provisioning
> > > > > > > > > > > > > >    one pool (both of which are hard to avoid).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > An alternative for CPU throttling would be to use
> > > this
> > > > > api:
> > > > > > > > > > > > > > http://docs.oracle.com/javase/
> > > > 1.5.0/docs/api/java/lang/
> > > > > > > > > > > > > > management/ThreadMXBean.html#
> > getThreadCpuTime(long)
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > That would let you track actual CPU usage across
> > the
> > > > > > network,
> > > > > > > > I/O
> > > > > > > > > > > > > threads,
> > > > > > > > > > > > > > and purgatory threads and look at it as a
> > percentage
> > > of
> > > > > > total
> > > > > > > > > > cores.
> > > > > > > > > > > I
> > > > > > > > > > > > > > think this fixes many problems in the reliability
> > of
> > > > the
> > > > > > > > metric.
> > > > > > > > > > > > > > Its meaning is slightly different as it is just CPU
> > (you
> > > > > don't
> > > > > > > get
> > > > > > > > > > > charged
> > > > > > > > > > > > > for
> > > > > > > > > > > > > > time blocking on I/O) but that may be okay
> because
> > we
> > > > > > already
> > > > > > > > > have
> > > > > > > > > > a
> > > > > > > > > > > > > > throttle on I/O. The downside is I think it is
> > > possible
> > > > > > this
> > > > > > > > api
> > > > > > > > > > can
> > > > > > > > > > > be
> > > > > > > > > > > > > > disabled or isn't always available and it may
> also
> > be
> > > > > > > expensive
> > > > > > > > > > (also
> > > > > > > > > > > > > I've
> > > > > > > > > > > > > > never used it so not sure if it really works the
> > way
> > > i
> > > > > > > think).
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > -Jay
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <
> > > > > > > > > becket.qin@gmail.com>
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If the purpose of the KIP is only to protect
> the
> > > > > cluster
> > > > > > > from
> > > > > > > > > > being
> > > > > > > > > > > > > > > overwhelmed by crazy clients and is not
> intended
> > to
> > > > > > address
> > > > > > > > > > > resource
> > > > > > > > > > > > > > > allocation problem among the clients, I am
> > > wondering
> > > > if
> > > > > > > using
> > > > > > > > > > > request
> > > > > > > > > > > > > > > handling time quota (CPU time quota) is a
> better
> > > > > option.
> > > > > > > Here
> > > > > > > > > are
> > > > > > > > > > > the
> > > > > > > > > > > > > > > reasons:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 1. request handling time quota has better
> > > protection.
> > > > > Say
> > > > > > > we
> > > > > > > > > have
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > > rate quota and set that to some value like 100
> > > > > > > requests/sec,
> > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > > > > > possible
> > > > > > > > > > > > > > > that some of the requests are very expensive
> > > actually
> > > > > > take
> > > > > > > a
> > > > > > > > > lot
> > > > > > > > > > of
> > > > > > > > > > > > > time
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > handle. In that case a few clients may still
> > > occupy a
> > > > > lot
> > > > > > > of
> > > > > > > > > CPU
> > > > > > > > > > > time
> > > > > > > > > > > > > > even
> > > > > > > > > > > > > > > the request rate is low. Arguably we can
> > carefully
> > > > set
> > > > > > > > request
> > > > > > > > > > rate
> > > > > > > > > > > > > quota
> > > > > > > > > > > > > > > for each request and client id combination, but
> > it
> > > > > could
> > > > > > > > still
> > > > > > > > > be
> > > > > > > > > > > > > tricky
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > get it right for everyone.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > If we use the request time handling quota, we
> can
> > > > > simply
> > > > > > > say
> > > > > > > > no
> > > > > > > > > > > > clients
> > > > > > > > > > > > > > can
> > > > > > > > > > > > > > > take up to more than 30% of the total request
> > > > handling
> > > > > > > > capacity
> > > > > > > > > > > > > (measured
> > > > > > > > > > > > > > > by time), regardless of the difference among
> > > > different
> > > > > > > > requests
> > > > > > > > > > or
> > > > > > > > > > > > what
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > the client doing. In this case maybe we can
> quota
> > > all
> > > > > the
> > > > > > > > > > requests
> > > > > > > > > > > if
> > > > > > > > > > > > > we
> > > > > > > > > > > > > > > want to.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > 2. The main benefit of using request rate limit
> > is
> > > > that
> > > > > > it
> > > > > > > > > seems
> > > > > > > > > > > more
> > > > > > > > > > > > > > > intuitive. It is true that it is probably
> easier
> > to
> > > > > > explain
> > > > > > > > to
> > > > > > > > > > the
> > > > > > > > > > > > user
> > > > > > > > > > > > > > > what does that mean. However, in practice it
> > looks
> > > > the
> > > > > > > impact
> > > > > > > > > of
> > > > > > > > > > > > > request
> > > > > > > > > > > > > > > rate quota is not more quantifiable than the
> > > request
> > > > > > > handling
> > > > > > > > > > time
> > > > > > > > > > > > > quota.
> > > > > > > > > > > > > > > Unlike the byte rate quota, it is still
> difficult
> > > to
> > > > > > give a
> > > > > > > > > > number
> > > > > > > > > > > > > about
> > > > > > > > > > > > > > > impact of throughput or latency when a request
> > rate
> > > > > quota
> > > > > > > is
> > > > > > > > > hit.
> > > > > > > > > > > So
> > > > > > > > > > > > it
> > > > > > > > > > > > > > is
> > > > > > > > > > > > > > > not better than the request handling time
> quota.
> > In
> > > > > fact
> > > > > > I
> > > > > > > > feel
> > > > > > > > > > it
> > > > > > > > > > > is
> > > > > > > > > > > > > > > clearer to tell user that "you are limited
> > because
> > > > you
> > > > > > have
> > > > > > > > > taken
> > > > > > > > > > > 30%
> > > > > > > > > > > > > of
> > > > > > > > > > > > > > > the CPU time on the broker" than otherwise
> > > something
> > > > > like
> > > > > > > > "your
> > > > > > > > > > > > request
> > > > > > > > > > > > > > > rate quota on metadata request has reached".
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <
> > > > > > > jay@confluent.io
> > > > > > > > >
> > > > > > > > > > > wrote:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think this proposal makes a lot of sense
> > > > > (especially
> > > > > > > now
> > > > > > > > > that
> > > > > > > > > > > it
> > > > > > > > > > > > is
> > > > > > > > > > > > > > > > oriented around request rate) and fills the
> > > biggest
> > > > > > > > remaining
> > > > > > > > > > gap
> > > > > > > > > > > > in
> > > > > > > > > > > > > > the
> > > > > > > > > > > > > > > > multi-tenancy story.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I think for intra-cluster communication
> > > > (StopReplica,
> > > > > > > etc)
> > > > > > > > we
> > > > > > > > > > > could
> > > > > > > > > > > > > > avoid
> > > > > > > > > > > > > > > > throttling entirely. You can secure or
> > otherwise
> > > > > > > lock-down
> > > > > > > > > the
> > > > > > > > > > > > > cluster
> > > > > > > > > > > > > > > > communication to avoid any unauthorized
> > external
> > > > > party
> > > > > > > from
> > > > > > > > > > > trying
> > > > > > > > > > > > to
> > > > > > > > > > > > > > > > initiate these requests. As a result we are
> as
> > > > likely
> > > > > > to
> > > > > > > > > cause
> > > > > > > > > > > > > problems
> > > > > > > > > > > > > > > as
> > > > > > > > > > > > > > > > solve them by throttling these, right?
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > I'm not so sure that we should exempt the
> > > consumer
> > > > > > > requests
> > > > > > > > > > such
> > > > > > > > > > > as
> > > > > > > > > > > > > > > > heartbeat. It's true that if we throttle an
> > app's
> > > > > > > heartbeat
> > > > > > > > > > > > requests
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > > may
> > > > > > > > > > > > > > > > cause it to fall out of its consumer group.
> > > However
> > > > > if
> > > > > > we
> > > > > > > > > don't
> > > > > > > > > > > > > > throttle
> > > > > > > > > > > > > > > it
> > > > > > > > > > > > > > > > it may DDOS the cluster if the heartbeat
> > interval
> > > > is
> > > > > > set
> > > > > > > > > > > > incorrectly
> > > > > > > > > > > > > or
> > > > > > > > > > > > > > > if
> > > > > > > > > > > > > > > > some client in some language has a bug. I
> think
> > > the
> > > > > > > policy
> > > > > > > > > with
> > > > > > > > > > > > this
> > > > > > > > > > > > > > kind
> > > > > > > > > > > > > > > > of throttling is to protect the cluster above
> > any
> > > > > > > > individual
> > > > > > > > > > app,
> > > > > > > > > > > > > > right?
> > > > > > > > > > > > > > > I
> > > > > > > > > > > > > > > > think in general this should be okay since
> for
> > > most
> > > > > > > > > deployments
> > > > > > > > > > > > this
> > > > > > > > > > > > > > > > setting is meant as more of a safety
> > valve---that
> > > > is
> > > > > > > rather
> > > > > > > > > > than
> > > > > > > > > > > > set
> > > > > > > > > > > > > > > > something very close to what you expect to
> need
> > > > (say
> > > > > 2
> > > > > > > > > req/sec
> > > > > > > > > > or
> > > > > > > > > > > > > > > whatever)
> > > > > > > > > > > > > > > > you would have something quite high (like 100
> > > > > req/sec)
> > > > > > > with
> > > > > > > > > > this
> > > > > > > > > > > > > meant
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > prevent a client gone crazy. I think when
> used
> > > this
> > > > > way
> > > > > > > > > > allowing
> > > > > > > > > > > > > those
> > > > > > > > > > > > > > to
> > > > > > > > > > > > > > > > be throttled would actually provide
> meaningful
> > > > > > > protection.
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > -Jay
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini
> > Sivaram <
> > > > > > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > I have just created KIP-124 to introduce
> > > request
> > > > > rate
> > > > > > > > > quotas
> > > > > > > > > > to
> > > > > > > > > > > > > > Kafka:
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > https://cwiki.apache.org/
> > > > > > confluence/display/KAFKA/KIP-
> > > > > > > > > > > > > > > > > 124+-+Request+rate+quotas
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > The proposal is for a simple percentage
> > request
> > > > > > > handling
> > > > > > > > > time
> > > > > > > > > > > > quota
> > > > > > > > > > > > > > > that
> > > > > > > > > > > > > > > > > can be allocated to *<client-id>*, *<user>*
> > or
> > > > > > *<user,
> > > > > > > > > > > > client-id>*.
> > > > > > > > > > > > > > > There
> > > > > > > > > > > > > > > > > are a few other suggestions also under
> > > "Rejected
> > > > > > > > > > alternatives".
> > > > > > > > > > > > > > > Feedback
> > > > > > > > > > > > > > > > > and suggestions are welcome.
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Thank you...
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > >
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > --
> > > > > > > > > > > > > -- Guozhang
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > > >
> > > >
> > > > --
> > > > -- Guozhang
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jay Kreps <ja...@confluent.io>.
A few minor comments:

   1. Isn't it the case that the throttling time response field should have
   the total time your request was throttled, irrespective of which quotas
   caused it? Limiting it to the byte rate quota doesn't make sense, but I
   also don't think we want to end up adding new fields in the response for
   every single thing we quota, right?
   2. I don't think we should make this quota specifically about I/O
   threads. Once we introduce these quotas people set them and expect them to
   be enforced (and if they aren't it may cause an outage). As a result they
   are a bit more sensitive than normal configs, I think. The current thread
   pools seem like something of an implementation detail and not the level
   the user-facing quotas should be involved with. I think it might be better
   to make this a general request-time throttle with no mention of I/O
   threads in the naming, and simply acknowledge in the docs the current
   limitation (which we may someday fix) that this covers only the time after
   the request is read off the network.
   3. As such I think the right interface to the user would be something
   like percent_request_time, in {0,...,100}, or request_time_ratio, in
   {0.0,...,1.0} (I think "ratio" is the terminology we used in the other
   metrics when the scale is between 0 and 1, right?). A rough sketch of the
   ratio form follows below.
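
For concreteness, here is a minimal sketch of the ratio form, tracking
per-client request-handler time against the pool's capacity. Everything here
(class and method names, the single-window reset) is hypothetical
illustration, not actual Kafka code; the pool's capacity over a window is
assumed to be windowNanos * numThreads.

    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.AtomicLong;

    // Tracks how much request-handler time each client used in the current
    // window and flags clients that exceed their configured ratio.
    public class RequestTimeRatioQuota {
        private final long windowNanos;      // length of the quota window
        private final int numThreads;        // request handler pool size
        private final double quotaRatio;     // e.g. 0.05 == 5% of pool time
        private final Map<String, AtomicLong> busyNanos = new ConcurrentHashMap<>();
        private volatile long windowStart = System.nanoTime();

        public RequestTimeRatioQuota(long windowNanos, int numThreads,
                                     double quotaRatio) {
            this.windowNanos = windowNanos;
            this.numThreads = numThreads;
            this.quotaRatio = quotaRatio;
        }

        // Called when a handler thread finishes a request for this client.
        public void record(String clientId, long elapsedNanos) {
            maybeRollWindow();
            busyNanos.computeIfAbsent(clientId, k -> new AtomicLong())
                     .addAndGet(elapsedNanos);
        }

        // True if the client used more than its ratio of total pool time.
        public boolean shouldThrottle(String clientId) {
            AtomicLong busy = busyNanos.get(clientId);
            if (busy == null) return false;
            double used = busy.get() / (double) (windowNanos * numThreads);
            return used > quotaRatio;
        }

        // Crude single-window reset; a real implementation would reuse the
        // multi-sample windows the existing byte-rate quotas already have.
        private void maybeRollWindow() {
            long now = System.nanoTime();
            if (now - windowStart > windowNanos) {
                busyNanos.clear();
                windowStart = now;
            }
        }
    }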

-Jay

On Thu, Feb 23, 2017 at 3:45 AM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Guozhang/Dong,
>
> Thank you for the feedback.
>
> Guozhang : I have updated the section on co-existence of byte rate and
> request time quotas.
>
> Dong: I hadn't added much detail to the metrics and sensors since they are
> going to be very similar to the existing metrics and sensors. To avoid
> confusion, I have now added more detail. All metrics are in the group
> "quotaType" and all sensors have names starting with "quotaType" (where
> quotaType is Produce/Fetch/LeaderReplication/
> FollowerReplication/*IOThread*).
> So there will be no reuse of existing metrics/sensors. The new ones for
> request processing time based throttling will be completely independent of
> existing metrics/sensors, but will be consistent in format.
>
> The existing throttle_time_ms field in produce/fetch responses will not be
> impacted by this KIP. That will continue to return byte-rate based
> throttling times. In addition, a new field request_throttle_time_ms will be
> added to return request quota based throttling times. These will be exposed
> as new metrics on the client-side.
>
> Since all metrics and sensors are different for each type of quota, I
> believe there are already sufficient metrics to monitor throttling on both
> the client and broker side for each type of throttling.
>
> Regards,
>
> Rajini

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Guozhang/Dong,

Thank you for the feedback.

Guozhang : I have updated the section on co-existence of byte rate and
request time quotas.

Dong: I hadn't added much detail to the metrics and sensors since they are
going to be very similar to the existing metrics and sensors. To avoid
confusion, I have now added more detail. All metrics are in the group
"quotaType" and all sensors have names starting with "quotaType" (where
quotaType is Produce/Fetch/LeaderReplication/FollowerReplication/*IOThread*).
So there will be no reuse of existing metrics/sensors. The new ones for
request processing time based throttling will be completely independent of
existing metrics/sensors, but will be consistent in format.
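
As a rough illustration of that naming scheme, per-quota sensors could be
registered along the following lines. This is a sketch against the common
metrics API; the exact sensor/metric names and the quota bound are
illustrative assumptions, not the final implementation.

    import java.util.concurrent.TimeUnit;
    import org.apache.kafka.common.metrics.MetricConfig;
    import org.apache.kafka.common.metrics.Metrics;
    import org.apache.kafka.common.metrics.Quota;
    import org.apache.kafka.common.metrics.Sensor;
    import org.apache.kafka.common.metrics.stats.Rate;

    public class QuotaSensorSketch {
        public static void main(String[] args) {
            Metrics metrics = new Metrics();
            String quotaType = "IOThread";       // or Produce, Fetch, ...
            String entity = "user1:clientA";     // a <user, client-id> pair

            // Sensor "IOThread-user1:clientA" with its metric in group
            // "IOThread", bounded by an upper quota on the measured rate.
            Sensor sensor = metrics.sensor(quotaType + "-" + entity,
                    new MetricConfig().quota(Quota.upperBound(0.05))
                                      .timeWindow(1, TimeUnit.SECONDS));
            sensor.add(metrics.metricName("request-time", quotaType,
                            "Request time used by " + entity),
                       new Rate(TimeUnit.SECONDS));

            // sensor.record(...) throws QuotaViolationException once the
            // measured rate exceeds the bound, which drives the throttling.
            sensor.record(0.01);
        }
    }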

The existing throttle_time_ms field in produce/fetch responses will not be
impacted by this KIP. That will continue to return byte-rate based
throttling times. In addition, a new field request_throttle_time_ms will be
added to return request quota based throttling times. These will be exposed
as new metrics on the client-side.
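
To make the two independent fields concrete, a client-side view could look
roughly like the sketch below. The ThrottleInfo type and its accessors are
hypothetical stand-ins for the generated protocol classes, and combining the
delays with max() assumes the new field reports the full request-time delay
rather than just the increment over the byte-rate delay.

    // Sketch: the two throttle fields a produce/fetch response would carry.
    public final class ThrottleInfo {
        private final int throttleTimeMs;         // existing, byte-rate based
        private final int requestThrottleTimeMs;  // proposed, request-time based

        public ThrottleInfo(int throttleTimeMs, int requestThrottleTimeMs) {
            this.throttleTimeMs = throttleTimeMs;
            this.requestThrottleTimeMs = requestThrottleTimeMs;
        }

        // Each field feeds its own client-side metric...
        public int throttleTimeMs() { return throttleTimeMs; }
        public int requestThrottleTimeMs() { return requestThrottleTimeMs; }

        // ...and, under the assumption above, the total delay the broker
        // applied to this response is the larger of the two values.
        public int totalThrottleTimeMs() {
            return Math.max(throttleTimeMs, requestThrottleTimeMs);
        }
    }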

Since all metrics and sensors are different for each type of quota, I
believe there are already sufficient metrics to monitor throttling on both
the client and broker side for each type of throttling.

Regards,

Rajini


On Thu, Feb 23, 2017 at 4:32 AM, Dong Lin <li...@gmail.com> wrote:

> Hey Rajini,
>
> I think it makes a lot of sense to use io_thread_units as the metric to
> quota a user's traffic here. LGTM overall. I have some questions regarding
> sensors.
>
> - Can you be more specific in the KIP about which sensors will be added?
> For example, it will be useful to specify the name and attributes of these
> new sensors.
>
> - We currently have throttle-time and queue-size for byte-rate based quota.
> Are you going to have separate throttle-time and queue-size for requests
> throttled by io_thread_unit-based quota, or will they share the same
> sensor?
>
> - Does the throttle-time in the ProduceResponse and FetchResponse contain
> time due to the io_thread_unit-based quota?
>
> - Currently the kafka server doesn't provide any log or metrics that tell
> whether any given clientId (or user) is throttled. This is not too bad
> because we can still check the client-side byte-rate metric to validate
> whether a given client is throttled. But with this io_thread_unit, there
> will be no way to validate whether a given client is slow because it has
> exceeded its io_thread_unit limit. It is necessary for users to be able to
> know this information to figure out whether they have reached their quota
> limit. How about we add a log4j log on the server side to periodically
> print the (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time)
> so that the kafka administrator can see which users have reached their
> limit and act accordingly?
>
> Thanks,
> Dong
>
>
>
>
>
> On Wed, Feb 22, 2017 at 4:46 PM, Guozhang Wang <wa...@gmail.com> wrote:
>
> > Made a pass over the doc, overall LGTM except a minor comment on the
> > throttling implementation:
> >
> > Stated as "Request processing time throttling will be applied on top if
> > necessary." I thought that it meant the request processing time
> > throttling is applied first, but continuing to read I found it actually
> > meant to apply produce / fetch byte rate throttling first.
> >
> > Also the last sentence "The remaining delay if any is applied to the
> > response." is a bit confusing to me. Maybe rewording it a bit?
> >
> >
> > Guozhang
> >
> >
> > On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Thanks for the updated KIP. The latest proposal looks good to me.
> > >
> > > Jun
> > >
> > > On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <
> rajinisivaram@gmail.com
> > >
> > > wrote:
> > >
> > > > Jun/Roger,
> > > >
> > > > Thank you for the feedback.
> > > >
> > > > 1. I have updated the KIP to use absolute units instead of percentage.
> > > > The property is called *io_thread_units* to align with the thread count
> > > > property *num.io.threads*. When we implement network thread utilization
> > > > quotas, we can add another property *network_thread_units*.
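> > > >
> > > > For example, the quota could then be set per client with the existing
> > > > config tooling, along the lines of (the property name comes from this
> > > > KIP; the value is illustrative):
> > > >
> > > >   bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
> > > >     --add-config 'io_thread_units=0.5' \
> > > >     --entity-type clients --entity-name clientA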
> > > >
> > > > 2. ControlledShutdown is already listed under the exempt requests.
> Jun,
> > > did
> > > > you mean a different request that needs to be added? The four
> requests
> > > > currently exempt in the KIP are StopReplica, ControlledShutdown,
> > > > LeaderAndIsr and UpdateMetadata. These are controlled using
> > ClusterAction
> > > > ACL, so it is easy to exclude and only throttle if unauthorized. I
> > wasn't
> > > > sure if there are other requests used only for inter-broker that
> needed
> > > to
> > > > be excluded.
> > > >
> > > > 3. I was thinking the smallest change would be to replace all
> > > > references to *requestChannel.sendResponse()* with a local method
> > > > *sendResponseMaybeThrottle()* that does the throttling if any plus
> > > > send response. If we throttle first in *KafkaApis.handle()*, the time
> > > > spent within the method handling the request will not be recorded or
> > > > used in throttling. We can look into this again when the PR is ready
> > > > for review.
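> > > >
> > > > To make that concrete, a rough sketch of the wrapper (in Java-ish form
> > > > for illustration; the real code is Scala and all names here are
> > > > placeholders that may change in the PR):
> > > >
> > > >   void sendResponseMaybeThrottle(Request request, Response response) {
> > > >       // Record the time this request spent on the request handler
> > > >       // thread and get back the delay, if any, implied by the quota.
> > > >       long throttleMs = quotaManager.recordAndGetThrottleTimeMs(
> > > >           request.session(), request.clientId(), request.handlerTimeNanos());
> > > >       if (throttleMs > 0)
> > > >           delayQueue.add(response, throttleMs); // delayed via purgatory
> > > >       else
> > > >           requestChannel.sendResponse(response);
> > > >   }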
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > >
> > > >
> > > > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <
> roger.hoover@gmail.com>
> > > > wrote:
> > > >
> > > > > Great to see this KIP and the excellent discussion.
> > > > >
> > > > > To me, Jun's suggestion makes sense.  If my application is
> allocated
> > 1
> > > > > request handler unit, then it's as if I have a Kafka broker with a
> > > single
> > > > > request handler thread dedicated to me.  That's the most I can use,
> > at
> > > > > least.  That allocation doesn't change even if an admin later
> > increases
> > > > the
> > > > > size of the request thread pool on the broker.  It's similar to the
> > CPU
> > > > > abstraction that VMs and containers get from hypervisors or OS
> > > > schedulers.
> > > > > While different client access patterns can use wildly different
> > amounts
> > > > of
> > > > > request thread resources per request, a given application will
> > > generally
> > > > > have a stable access pattern and can figure out empirically how many
> > > > > "request thread units" it needs to meet its throughput/latency goals.
> > > > >
> > > > > Cheers,
> > > > >
> > > > > Roger
> > > > >
> > > > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <ju...@confluent.io> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Thanks for the updated KIP. A few more comments.
> > > > > >
> > > > > > 1. A concern of request_time_percent is that it's not an absolute
> > > > value.
> > > > > > Let's say you give a user a 10% limit. If the admin doubles the
> > > number
> > > > of
> > > > > > request handler threads, that user now actually has twice the
> > > absolute
> > > > > > capacity. This may confuse people a bit. So, perhaps setting the
> > > quota
> > > > > > based on an absolute request thread unit is better.
> > > > > >
> > > > > > 2. ControlledShutdownRequest is also an inter-broker request and
> > > needs
> > > > to
> > > > > > be excluded from throttling.
> > > > > >
> > > > > > 3. Implementation wise, I am wondering if it's simpler to apply
> the
> > > > > request
> > > > > > time throttling first in KafkaApis.handle(). Otherwise, we will
> > need
> > > to
> > > > > add
> > > > > > the throttling logic in each type of request.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> > > > rajinisivaram@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Jun,
> > > > > > >
> > > > > > > Thank you for the review.
> > > > > > >
> > > > > > > I have reverted to the original KIP that throttles based on
> > request
> > > > > > handler
> > > > > > > utilization. At the moment, it uses percentage, but I am happy
> to
> > > > > change
> > > > > > to
> > > > > > > a fraction (out of 1 instead of 100) if required. I have added
> > the
> > > > > > examples
> > > > > > > from this discussion to the KIP. Also added a "Future Work"
> > section
> > > > to
> > > > > > > address network thread utilization. The configuration is named
> > > > > > > "request_time_percent" with the expectation that it can also be
> > > used
> > > > as
> > > > > > the
> > > > > > > limit for network thread utilization when that is implemented,
> so
> > > > that
> > > > > > > users have to set only one config for the two and not have to
> > worry
> > > > > about
> > > > > > > the internal distribution of the work between the two thread
> > pools
> > > in
> > > > > > > Kafka.
> > > > > > >
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > >
> > > > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <ju...@confluent.io>
> > > wrote:
> > > > > > >
> > > > > > > > Hi, Rajini,
> > > > > > > >
> > > > > > > > Thanks for the proposal.
> > > > > > > >
> > > > > > > > The benefit of using the request processing time over the
> > request
> > > > > rate
> > > > > > is
> > > > > > > > exactly what people have said. I will just expand that a bit.
> > > > > Consider
> > > > > > > the
> > > > > > > > following case. The producer sends a produce request with a
> > 10MB
> > > > > > message
> > > > > > > > but compressed to 100KB with gzip. The decompression of the
> > > message
> > > > > on
> > > > > > > the
> > > > > > > > broker could take 10-15 seconds, during which time, a request
> > > > handler
> > > > > > > > thread is completely blocked. In this case, neither the
> byte-in
> > > > quota
> > > > > > nor
> > > > > > > > the request rate quota may be effective in protecting the
> > broker.
> > > > > > > Consider
> > > > > > > > another case. A consumer group starts with 10 instances and
> > > > > > > > later switches to 20 instances. The request rate will likely
> > > > > > > > double, but the actual load on the broker may not double since
> > > > > > > > each fetch request only contains half of the partitions. Request
> > > > > > > > rate quota may not be easy to configure in this case.
> > > > > > > >
> > > > > > > > What we really want is to be able to prevent a client from
> > using
> > > > too
> > > > > > much
> > > > > > > > of the server side resources. In this particular KIP, this
> > > resource
> > > > > is
> > > > > > > the
> > > > > > > > capacity of the request handler threads. I agree that it may
> > not
> > > be
> > > > > > > > intuitive for the users to determine how to set the right
> > limit.
> > > > > > However,
> > > > > > > > this is not completely new and has been done in the container
> > > world
> > > > > > > > already. For example, Linux cgroup
> > > > > > > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> > > > > > > > has the concept of
> > > > > > > > cpu.cfs_quota_us,
> > > > > > > > which specifies the total amount of time in microseconds for
> > > which
> > > > > all
> > > > > > > > tasks in a cgroup can run during a one second period. We can
> > > > > > potentially
> > > > > > > > model the request handler threads in a similar way. For
> > example,
> > > > each
> > > > > > > > request handler thread can be 1 request handler unit and the
> > > admin
> > > > > can
> > > > > > > > configure a limit on how many units (say 0.01) a client can
> > have.
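> > > > > > > >
> > > > > > > > To put numbers on that: with a one second quota window, a limit
> > > > > > > > of 0.01 units would allow a client 0.01 thread-seconds of
> > > > > > > > request handler time per second, i.e. 10 ms; on a broker with
> > > > > > > > num.io.threads=8 that is 10 ms out of 8,000 ms of total handler
> > > > > > > > capacity, or 0.125%. In the cgroup analogy this corresponds to
> > > > > > > > cpu.cfs_quota_us=10000 with cpu.cfs_period_us=1000000.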
> > > > > > > >
> > > > > > > > Regarding not throttling the internal broker to broker
> > requests.
> > > We
> > > > > > could
> > > > > > > > do that. Alternatively, we could just let the admin
> configure a
> > > > high
> > > > > > > limit
> > > > > > > > for the kafka user (it may not be able to do that easily
> based
> > on
> > > > > > > clientId
> > > > > > > > though).
> > > > > > > >
> > > > > > > > Ideally we want to be able to protect the utilization of the
> > > > network
> > > > > > > thread
> > > > > > > > pool too. The difficulty is mostly what Rajini said: (1) The
> > > > mechanism
> > > > > > for
> > > > > > > > throttling the requests is through Purgatory and we will have
> > to
> > > > > think
> > > > > > > > through how to integrate that into the network layer.  (2) In
> > the
> > > > > > network
> > > > > > > > layer, currently we know the user, but not the clientId of
> the
> > > > > request.
> > > > > > > So,
> > > > > > > > it's a bit tricky to throttle based on clientId there. Plus,
> > the
> > > > > > byteOut
> > > > > > > > quota can already protect the network thread utilization for
> > > fetch
> > > > > > > > requests. So, if we can't figure out this part right now,
> just
> > > > > focusing
> > > > > > > on
> > > > > > > > the request handling threads for this KIP is still a useful
> > > > feature.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jun
> > > > > > > >
> > > > > > > >
> > > > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <
> > > > > > rajinisivaram@gmail.com
> > > > > > > >
> > > > > > > > wrote:
> > > > > > > >
> > > > > > > > > Thank you all for the feedback.
> > > > > > > > >
> > > > > > > > > Jay: I have removed exemption for consumer heartbeat etc.
> > Agree
> > > > > that
> > > > > > > > > protecting the cluster is more important than protecting
> > > > individual
> > > > > > > apps.
> > > > > > > > > Have retained the exemption for StopReplica/LeaderAndIsr
> > etc,
> > > > > these
> > > > > > > are
> > > > > > > > > throttled only if authorization fails (so can't be used for
> > DoS
> > > > > > attacks
> > > > > > > > in
> > > > > > > > > a secure cluster, but allows inter-broker requests to
> > complete
> > > > > > without
> > > > > > > > > delays).
> > > > > > > > >
> > > > > > > > > I will wait another day to see if there is any objection to
> > > > quotas
> > > > > > > based
> > > > > > > > on
> > > > > > > > > request processing time (as opposed to request rate) and if
> > > there
> > > > > are
> > > > > > > no
> > > > > > > > > objections, I will revert to the original proposal with
> some
> > > > > changes.
> > > > > > > > >
> > > > > > > > > The original proposal only included the time used by the
> > > > > > > > > request handler threads (that made calculation easy). I think the
> > > > > suggestion
> > > > > > is
> > > > > > > > to
> > > > > > > > > include the time spent in the network threads as well since
> > > that
> > > > > may
> > > > > > be
> > > > > > > > > significant. As Jay pointed out, it is more complicated to
> > > > > calculate
> > > > > > > the
> > > > > > > > > total available CPU time and convert to a ratio when there
> > > > > > > > > are *m* I/O threads
> > > > > > > > > and *n* network threads. ThreadMXBean#getThreadCPUTime()
> may
> > > > give
> > > > > us
> > > > > > > > what
> > > > > > > > > we want, but it can be very expensive on some platforms. As
> > > > Becket
> > > > > > and
> > > > > > > > > Guozhang have pointed out, we do have several time
> > measurements
> > > > > > already
> > > > > > > > for
> > > > > > > > > generating metrics that we could use, though we might want
> to
> > > > > switch
> > > > > > to
> > > > > > > > > nanoTime() instead of currentTimeMillis() since some of the
> > > > values
> > > > > > for
> > > > > > > > > small requests may be < 1ms. But rather than add up the
> time
> > > > spent
> > > > > in
> > > > > > > I/O
> > > > > > > > > thread and network thread, wouldn't it be better to convert
> > the
> > > > > time
> > > > > > > > spent
> > > > > > > > > on each thread into a separate ratio? UserA has a request
> > quota
> > > > of
> > > > > > 5%.
> > > > > > > > Can
> > > > > > > > > we take that to mean that UserA can use 5% of the time on
> > > network
> > > > > > > threads
> > > > > > > > > and 5% of the time on I/O threads? If either is exceeded,
> the
> > > > > > response
> > > > > > > is
> > > > > > > > > throttled - it would mean maintaining two sets of metrics
> for
> > > the
> > > > > two
> > > > > > > > > durations, but would result in more meaningful ratios. We
> > could
> > > > > > define
> > > > > > > > two
> > > > > > > > > quota limits (UserA has 5% of request threads and 10% of
> > > network
> > > > > > > > threads),
> > > > > > > > > but that seems unnecessary and harder to explain to users.
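> > > > > > > > >
> > > > > > > > > A sketch of what I mean by one quota checked as two ratios
> > > > > > > > > (all names invented for illustration, not actual code):
> > > > > > > > >
> > > > > > > > >   // One configured quota, checked independently against the
> > > > > > > > >   // time recorded on each thread pool over the quota window.
> > > > > > > > >   boolean quotaViolated(double quotaPercent,
> > > > > > > > >                         double ioThreadTimePercent,
> > > > > > > > >                         double networkThreadTimePercent) {
> > > > > > > > >       return ioThreadTimePercent > quotaPercent
> > > > > > > > >           || networkThreadTimePercent > quotaPercent;
> > > > > > > > >   }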
> > > > > > > > >
> > > > > > > > > Back to why and how quotas are applied to network thread
> > > > > utilization:
> > > > > > > > > a) In the case of fetch,  the time spent in the network
> > thread
> > > > may
> > > > > be
> > > > > > > > > significant and I can see the need to include this. Are
> there
> > > > other
> > > > > > > > > requests where the network thread utilization is
> significant?
> > > In
> > > > > the
> > > > > > > case
> > > > > > > > > of fetch, request handler thread utilization would throttle
> > > > clients
> > > > > > > with
> > > > > > > > > high request rate, low data volume and fetch byte rate
> quota
> > > will
> > > > > > > > throttle
> > > > > > > > > clients with high data volume. Network thread utilization
> is
> > > > > perhaps
> > > > > > > > > proportional to the data volume. I am wondering if we even
> > need
> > > > to
> > > > > > > > throttle
> > > > > > > > > based on network thread utilization or whether the data
> > volume
> > > > > quota
> > > > > > > > covers
> > > > > > > > > this case.
> > > > > > > > >
> > > > > > > > > b) At the moment, we record and check for quota violation
> at
> > > the
> > > > > same
> > > > > > > > time.
> > > > > > > > > If a quota is violated, the response is delayed. Using Jay's
> > > > > > > > > example of disk reads for fetches happening in the network
> > > > > > > > > thread, we can't
> > > > > > record
> > > > > > > > and
> > > > > > > > > delay a response after the disk reads. We could record the
> > time
> > > > > spent
> > > > > > > on
> > > > > > > > > the network thread when the response is complete and
> > introduce
> > > a
> > > > > > delay
> > > > > > > > for
> > > > > > > > > handling a subsequent request (separate out recording and
> > quota
> > > > > > > violation
> > > > > > > > > handling in the case of network thread overload). Does that
> > > make
> > > > > > sense?
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Regards,
> > > > > > > > >
> > > > > > > > > Rajini
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
> > > > becket.qin@gmail.com>
> > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hey Jay,
> > > > > > > > > >
> > > > > > > > > > Yeah, I agree that enforcing the CPU time is a little
> > > tricky. I
> > > > > am
> > > > > > > > > thinking
> > > > > > > > > > that maybe we can use the existing request statistics.
> They
> > > are
> > > > > > > already
> > > > > > > > > > very detailed so we can probably see the approximate CPU
> > time
> > > > > from
> > > > > > > it,
> > > > > > > > > e.g.
> > > > > > > > > > something like (total_time - request/response_queue_time
> -
> > > > > > > > remote_time).
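> > > > > > > > > >
> > > > > > > > > > In terms of the existing JMX metrics under
> > > > > > > > > > kafka.network:type=RequestMetrics that is roughly
> > > > > > > > > >
> > > > > > > > > >   approx_cpu_time ~= TotalTimeMs - RequestQueueTimeMs
> > > > > > > > > >                      - ResponseQueueTimeMs - RemoteTimeMs
> > > > > > > > > >
> > > > > > > > > > i.e. LocalTimeMs plus ResponseSendTimeMs.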
> > > > > > > > > >
> > > > > > > > > > I agree with Guozhang that when a user is throttled it is
> > > > likely
> > > > > > that
> > > > > > > > we
> > > > > > > > > > need to see if anything has gone wrong first, and if the
> > > users
> > > > > are
> > > > > > > well
> > > > > > > > > > behaving and just need more resources, we will have to
> bump
> > > up
> > > > > the
> > > > > > > > quota
> > > > > > > > > > for them. It is true that pre-allocating CPU time quota
> > > > precisely
> > > > > > for
> > > > > > > > the
> > > > > > > > > > users is difficult. So in practice it would probably be
> > more
> > > > like
> > > > > > > first
> > > > > > > > > set
> > > > > > > > > > a relatively high protective CPU time quota for everyone
> and
> > > > > increase
> > > > > > > > that
> > > > > > > > > > for some individual clients on demand.
> > > > > > > > > >
> > > > > > > > > > Thanks,
> > > > > > > > > >
> > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <
> > > > > wangguoz@gmail.com
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > This is a great proposal, glad to see it happening.
> > > > > > > > > > >
> > > > > > > > > > > I am inclined to the CPU throttling, or more
> specifically
> > > > > > > processing
> > > > > > > > > time
> > > > > > > > > > > ratio instead of the request rate throttling as well.
> > > Becket
> > > > > has
> > > > > > > very
> > > > > > > > > > well
> > > > > > > > > > > summed my rationales above, and one thing to add here
> is
> > > that
> > > > > the
> > > > > > > > > former
> > > > > > > > > > > has a good support for both "protecting against rogue
> > > > clients"
> > > > > as
> > > > > > > > well
> > > > > > > > > as
> > > > > > > > > > > "utilizing a cluster for multi-tenancy usage": when
> > > thinking
> > > > > > about
> > > > > > > > how
> > > > > > > > > to
> > > > > > > > > > > explain this to the end users, I find it actually more
> > > > natural
> > > > > > than
> > > > > > > > the
> > > > > > > > > > > request rate since as mentioned above, different
> requests
> > > > will
> > > > > > have
> > > > > > > > > quite
> > > > > > > > > > > different "cost", and Kafka today already has various
> > > > request
> > > > > > > types
> > > > > > > > > > > (produce, fetch, admin, metadata, etc), because of that
> > the
> > > > > > request
> > > > > > > > > rate
> > > > > > > > > > > throttling may not be as effective unless it is set
> very
> > > > > > > > > conservatively.
> > > > > > > > > > >
> > > > > > > > > > > Regarding user reactions when they are throttled, I
> > > think
> > > > it
> > > > > > may
> > > > > > > > > > differ
> > > > > > > > > > > case-by-case, and need to be discovered / guided by
> > looking
> > > > at
> > > > > > > > relative
> > > > > > > > > > > metrics. So in other words users would not expect to
> get
> > > > > > additional
> > > > > > > > > > > information by simply being told "hey, you are
> > throttled",
> > > > > which
> > > > > > is
> > > > > > > > all
> > > > > > > > > > > what throttling does; they need to take a follow-up
> step
> > > and
> > > > > see
> > > > > > > > "hmm,
> > > > > > > > > > I'm
> > > > > > > > > > > throttled probably because of ..", which is by looking
> at
> > > > other
> > > > > > > > metric
> > > > > > > > > > > values: e.g. whether I'm bombarding the brokers with
> > > metadata
> > > > > > > > request,
> > > > > > > > > > > which are usually cheap to handle but I'm sending
> > thousands
> > > > per
> > > > > > > > second;
> > > > > > > > > > or
> > > > > > > > > > > is it because I'm catching up and hence sending very
> > heavy
> > > > > > fetching
> > > > > > > > > > request
> > > > > > > > > > > with large min.bytes, etc.
> > > > > > > > > > >
> > > > > > > > > > > Regarding the implementation, as once discussed with
> > > Jun,
> > > > > this
> > > > > > > > seems
> > > > > > > > > > not
> > > > > > > > > > > very difficult since today we are already collecting
> the
> > > > > "thread
> > > > > > > pool
> > > > > > > > > > > utilization" metrics, which is a single percentage
> > > > > > > > "aggregateIdleMeter"
> > > > > > > > > > > value; but we are already effectively aggregating it
> for
> > > each
> > > > > > > > requests
> > > > > > > > > in
> > > > > > > > > > > KafkaRequestHandler, and we can just extend it by
> > recording
> > > > the
> > > > > > > > source
> > > > > > > > > > > client id when handling them and aggregating by
> clientId
> > as
> > > > > well
> > > > > > as
> > > > > > > > the
> > > > > > > > > > > total aggregate.
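> > > > > > > > > > >
> > > > > > > > > > > Roughly, in KafkaRequestHandler terms (a sketch, with
> > > > > > > > > > > invented names, not the actual code):
> > > > > > > > > > >
> > > > > > > > > > >   // Alongside the existing aggregateIdleMeter bookkeeping,
> > > > > > > > > > >   // charge the elapsed handler time to the request's client id.
> > > > > > > > > > >   long startNanos = System.nanoTime();
> > > > > > > > > > >   handle(request);                 // existing request handling
> > > > > > > > > > >   long elapsedNanos = System.nanoTime() - startNanos;
> > > > > > > > > > >   clientQuotaMetrics.record(request.clientId(), elapsedNanos);
> > > > > > > > > > >   totalHandlerTime.record(elapsedNanos); // cluster-wide total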
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > Guozhang
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <
> > > jay@confluent.io
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > Hey Becket/Rajini,
> > > > > > > > > > > >
> > > > > > > > > > > > When I thought about it more deeply I came around to
> > the
> > > > > > "percent
> > > > > > > > of
> > > > > > > > > > > > processing time" metric too. It seems a lot closer to
> > the
> > > > > thing
> > > > > > > we
> > > > > > > > > > > actually
> > > > > > > > > > > > care about and need to protect. I also think this
> would
> > > be
> > > > a
> > > > > > very
> > > > > > > > > > useful
> > > > > > > > > > > > metric even in the absence of throttling just to debug
> > > > > > > > > > > > who's using capacity.
> > > > > > > > > > > >
> > > > > > > > > > > > Two problems to consider:
> > > > > > > > > > > >
> > > > > > > > > > > >    1. I agree that for the user it is understandable
> > > > > > > > > > > >    what led to their being throttled, but it is a bit hard to figure
> out
> > > the
> > > > > safe
> > > > > > > > range
> > > > > > > > > > for
> > > > > > > > > > > >    them. i.e. if I have a new app that will send 200
> > > > > > > messages/sec I
> > > > > > > > > can
> > > > > > > > > > > >    probably reason that I'll be under the throttling
> > > limit
> > > > of
> > > > > > 300
> > > > > > > > > > > req/sec.
> > > > > > > > > > > >    However if I need to be under a 10% CPU resources
> > > limit
> > > > it
> > > > > > may
> > > > > > > > be
> > > > > > > > > a
> > > > > > > > > > > bit
> > > > > > > > > > > >    harder for me to know a priori if I will or won't.
> > > > > > > > > > > >    2. Calculating the available CPU time is a bit
> > > difficult
> > > > > > since
> > > > > > > > > there
> > > > > > > > > > > are
> > > > > > > > > > > >    actually two thread pools--the I/O threads and the
> > > > network
> > > > > > > > > threads.
> > > > > > > > > > I
> > > > > > > > > > > > think
> > > > > > > > > > > >    it might be workable to count just the I/O thread
> > time
> > > > as
> > > > > in
> > > > > > > the
> > > > > > > > > > > > proposal,
> > > > > > > > > > > >    but the network thread work is actually
> non-trivial
> > > > (e.g.
> > > > > > all
> > > > > > > > the
> > > > > > > > > > disk
> > > > > > > > > > > >    reads for fetches happen in that thread). If you
> > count
> > > > > both
> > > > > > > the
> > > > > > > > > > > network
> > > > > > > > > > > > and
> > > > > > > > > > > >    I/O threads it can skew things a bit. E.g. say you
> > > have
> > > > 50
> > > > > > > > network
> > > > > > > > > > > > threads,
> > > > > > > > > > > >    10 I/O threads, and 8 cores, what is the CPU time
> > > > > > > > > > > >    available in a
> > > > > > > > > > > >    second? I suppose this is a problem whenever you
> > have
> > > a
> > > > > > > > bottleneck
> > > > > > > > > > > > between
> > > > > > > > > > > >    I/O and network threads or if you end up
> > significantly
> > > > > > > > > > > over-provisioning
> > > > > > > > > > > >    one pool (both of which are hard to avoid).
> > > > > > > > > > > >
> > > > > > > > > > > > An alternative for CPU throttling would be to use
> this
> > > api:
> > > > > > > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> > > > > > > > > > > >
> > > > > > > > > > > > That would let you track actual CPU usage across the
> > > > network,
> > > > > > I/O
> > > > > > > > > > > threads,
> > > > > > > > > > > > and purgatory threads and look at it as a percentage
> of
> > > > total
> > > > > > > > cores.
> > > > > > > > > I
> > > > > > > > > > > > think this fixes many problems in the reliability of
> > the
> > > > > > metric.
> > > > > > > > Its
> > > > > > > > > > > > meaning is slightly different as it is just CPU (you
> > > don't
> > > > > get
> > > > > > > > > charged
> > > > > > > > > > > for
> > > > > > > > > > > > time blocking on I/O) but that may be okay because we
> > > > already
> > > > > > > have
> > > > > > > > a
> > > > > > > > > > > > throttle on I/O. The downside is I think it is
> possible
> > > > this
> > > > > > api
> > > > > > > > can
> > > > > > > > > be
> > > > > > > > > > > > disabled or isn't always available and it may also be
> > > > > expensive
> > > > > > > > (also
> > > > > > > > > > > I've
> > > > > > > > > > > > never used it so not sure if it really works the way I
> > > > > > > > > > > > think).
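> > > > > > > > > > > >
> > > > > > > > > > > > Something along these lines (untested, and subject to the
> > > > > > > > > > > > caveats above about availability and cost):
> > > > > > > > > > > >
> > > > > > > > > > > >   import java.lang.management.ManagementFactory;
> > > > > > > > > > > >   import java.lang.management.ThreadMXBean;
> > > > > > > > > > > >
> > > > > > > > > > > >   ThreadMXBean bean = ManagementFactory.getThreadMXBean();
> > > > > > > > > > > >   if (bean.isThreadCpuTimeSupported()) {
> > > > > > > > > > > >       bean.setThreadCpuTimeEnabled(true);
> > > > > > > > > > > >       long cpuNanos = bean.getThreadCpuTime(threadId); // -1 if unavailable
> > > > > > > > > > > >       // sum over network/I/O/purgatory threads and divide by
> > > > > > > > > > > >       // (numCores * windowNanos) for a utilization fraction
> > > > > > > > > > > >   }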
> > > > > > > > > > > >
> > > > > > > > > > > > -Jay
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <
> > > > > > > becket.qin@gmail.com>
> > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > If the purpose of the KIP is only to protect the
> > > cluster
> > > > > from
> > > > > > > > being
> > > > > > > > > > > > > overwhelmed by crazy clients and is not intended to
> > > > address
> > > > > > > > > resource
> > > > > > > > > > > > > allocation problem among the clients, I am
> wondering
> > if
> > > > > using
> > > > > > > > > request
> > > > > > > > > > > > > handling time quota (CPU time quota) is a better
> > > option.
> > > > > Here
> > > > > > > are
> > > > > > > > > the
> > > > > > > > > > > > > reasons:
> > > > > > > > > > > > >
> > > > > > > > > > > > > 1. request handling time quota has better
> protection.
> > > Say
> > > > > we
> > > > > > > have
> > > > > > > > > > > request
> > > > > > > > > > > > > rate quota and set that to some value like 100
> > > > > requests/sec,
> > > > > > it
> > > > > > > > is
> > > > > > > > > > > > possible
> > > > > > > > > > > > > that some of the requests are very expensive and actually
> actually
> > > > take
> > > > > a
> > > > > > > lot
> > > > > > > > of
> > > > > > > > > > > time
> > > > > > > > > > > > to
> > > > > > > > > > > > > handle. In that case a few clients may still
> occupy a
> > > lot
> > > > > of
> > > > > > > CPU
> > > > > > > > > time
> > > > > > > > > > > > even
> > > > > > > > > > > > > the request rate is low. Arguably we can carefully
> > set
> > > > > > request
> > > > > > > > rate
> > > > > > > > > > > quota
> > > > > > > > > > > > > for each request and client id combination, but it
> > > could
> > > > > > still
> > > > > > > be
> > > > > > > > > > > tricky
> > > > > > > > > > > > to
> > > > > > > > > > > > > get it right for everyone.
> > > > > > > > > > > > >
> > > > > > > > > > > > > If we use the request handling time quota, we can
> > > simply
> > > > > say
> > > > > > no
> > > > > > > > > > clients
> > > > > > > > > > > > can
> > > > > > > > > > > > > take up more than 30% of the total request
> > handling
> > > > > > capacity
> > > > > > > > > > > (measured
> > > > > > > > > > > > > by time), regardless of the difference among
> > different
> > > > > > requests
> > > > > > > > or
> > > > > > > > > > what
> > > > > > > > > > > > is
> > > > > > > > > > > > > the client doing. In this case maybe we can quota
> all
> > > the
> > > > > > > > requests
> > > > > > > > > if
> > > > > > > > > > > we
> > > > > > > > > > > > > want to.
> > > > > > > > > > > > >
> > > > > > > > > > > > > 2. The main benefit of using request rate limit is
> > that
> > > > it
> > > > > > > seems
> > > > > > > > > more
> > > > > > > > > > > > > intuitive. It is true that it is probably easier to
> > > > explain
> > > > > > to
> > > > > > > > the
> > > > > > > > > > user
> > > > > > > > > > > > > what does that mean. However, in practice it looks like
> > the
> > > > > impact
> > > > > > > of
> > > > > > > > > > > request
> > > > > > > > > > > > > rate quota is not more quantifiable than the
> request
> > > > > handling
> > > > > > > > time
> > > > > > > > > > > quota.
> > > > > > > > > > > > > Unlike the byte rate quota, it is still difficult
> to
> > > > give a
> > > > > > > > number
> > > > > > > > > > > about
> > > > > > > > > > > > > the impact on throughput or latency when a request rate
> > > quota
> > > > > is
> > > > > > > hit.
> > > > > > > > > So
> > > > > > > > > > it
> > > > > > > > > > > > is
> > > > > > > > > > > > > not better than the request handling time quota. In
> > > fact
> > > > I
> > > > > > feel
> > > > > > > > it
> > > > > > > > > is
> > > > > > > > > > > > > clearer to tell the user "you are limited because you have
> > > > > > > > > > > > > taken 30% of the CPU time on the broker" than something like
> > > > > > > > > > > > > "your request rate quota on metadata requests has been
> > > > > > > > > > > > > reached".
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thanks,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <
> > > > > jay@confluent.io
> > > > > > >
> > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > I think this proposal makes a lot of sense
> > > (especially
> > > > > now
> > > > > > > that
> > > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > > > > > oriented around request rate) and fills the
> biggest
> > > > > > remaining
> > > > > > > > gap
> > > > > > > > > > in
> > > > > > > > > > > > the
> > > > > > > > > > > > > > multi-tenancy story.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I think for intra-cluster communication
> > (StopReplica,
> > > > > etc)
> > > > > > we
> > > > > > > > > could
> > > > > > > > > > > > avoid
> > > > > > > > > > > > > > throttling entirely. You can secure or otherwise
> > > > > lock-down
> > > > > > > the
> > > > > > > > > > > cluster
> > > > > > > > > > > > > > communication to avoid any unauthorized external
> > > party
> > > > > from
> > > > > > > > > trying
> > > > > > > > > > to
> > > > > > > > > > > > > > initiate these requests. As a result we are as
> > likely
> > > > to
> > > > > > > cause
> > > > > > > > > > > problems
> > > > > > > > > > > > > as
> > > > > > > > > > > > > > solve them by throttling these, right?
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I'm not so sure that we should exempt the
> consumer
> > > > > requests
> > > > > > > > such
> > > > > > > > > as
> > > > > > > > > > > > > > heartbeat. It's true that if we throttle an app's
> > > > > heartbeat
> > > > > > > > > > requests
> > > > > > > > > > > it
> > > > > > > > > > > > > may
> > > > > > > > > > > > > > cause it to fall out of its consumer group.
> However
> > > if
> > > > we
> > > > > > > don't
> > > > > > > > > > > > throttle
> > > > > > > > > > > > > it
> > > > > > > > > > > > > > it may DDOS the cluster if the heartbeat interval
> > is
> > > > set
> > > > > > > > > > incorrectly
> > > > > > > > > > > or
> > > > > > > > > > > > > if
> > > > > > > > > > > > > > some client in some language has a bug. I think
> the
> > > > > policy
> > > > > > > with
> > > > > > > > > > this
> > > > > > > > > > > > kind
> > > > > > > > > > > > > > of throttling is to protect the cluster above any
> > > > > > individual
> > > > > > > > app,
> > > > > > > > > > > > right?
> > > > > > > > > > > > > I
> > > > > > > > > > > > > > think in general this should be okay since for
> most
> > > > > > > deployments
> > > > > > > > > > this
> > > > > > > > > > > > > > setting is meant as more of a safety valve---that
> > is
> > > > > rather
> > > > > > > > than
> > > > > > > > > > set
> > > > > > > > > > > > > > something very close to what you expect to need
> > (say
> > > 2
> > > > > > > req/sec
> > > > > > > > or
> > > > > > > > > > > > > whatever)
> > > > > > > > > > > > > > you would have something quite high (like 100
> > > req/sec)
> > > > > with
> > > > > > > > this
> > > > > > > > > > > meant
> > > > > > > > > > > > to
> > > > > > > > > > > > > > prevent a client gone crazy. I think when used
> this
> > > way
> > > > > > > > allowing
> > > > > > > > > > > those
> > > > > > > > > > > > to
> > > > > > > > > > > > > > be throttled would actually provide meaningful
> > > > > protection.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > -Jay
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <
> > > > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > wrote:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > I have just created KIP-124 to introduce
> request
> > > rate
> > > > > > > quotas
> > > > > > > > to
> > > > > > > > > > > > Kafka:
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > https://cwiki.apache.org/
> > > > confluence/display/KAFKA/KIP-
> > > > > > > > > > > > > > > 124+-+Request+rate+quotas
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > The proposal is for a simple percentage request
> > > > > handling
> > > > > > > time
> > > > > > > > > > quota
> > > > > > > > > > > > > that
> > > > > > > > > > > > > > > can be allocated to *<client-id>*, *<user>* or
> > > > *<user,
> > > > > > > > > > client-id>*.
> > > > > > > > > > > > > There
> > > > > > > > > > > > > > > are a few other suggestions also under
> "Rejected
> > > > > > > > alternatives".
> > > > > > > > > > > > > Feedback
> > > > > > > > > > > > > > > and suggestions are welcome.
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Thank you...
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > > >
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > --
> > > > > > > > > > > -- Guozhang
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
> >
> >
> > --
> > -- Guozhang
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Dong Lin <li...@gmail.com>.
Hey Rajini,

I think it makes a lot of sense to use io_thread_units as the metric to
quota a user's traffic here. LGTM overall. I have some questions regarding
sensors.

- Can you be more specific in the KIP about what sensors will be added? For
example, it will be useful to specify the name and attributes of these new
sensors.

- We currently have throttle-time and queue-size for byte-rate based quota.
Are you going to have separate throttle-time and queue-size for requests
throttled by io_thread_unit-based quota, or will they share the same sensor?

- Does the throttle-time in the ProduceResponse and FetchResponse contain
time due to the io_thread_unit-based quota?

- Currently the Kafka server doesn't provide any log or metric that tells
whether any given clientId (or user) is throttled. This is not too bad
because we can still check the client-side byte-rate metric to validate
whether a given client is throttled. But with this io_thread_unit, there
will be no way to validate whether a given client is slow because it has
exceeded its io_thread_unit limit. It is necessary for users to be able to
know this information to figure out whether they have reached their quota
limit. How about we add a log4j log on the server side to periodically
print (client_id, byte-rate-throttle-time, io-thread-unit-throttle-time)
so that the Kafka administrator can identify those users that have reached
their limit and act accordingly?
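
For illustration, such a periodic log line might be produced along these
lines (an SLF4J-style sketch; all names and the format are made up):

    // Summary logged per throttled client, e.g. once a minute.
    log.info("Throttling summary: clientId={} byteRateThrottleTimeMs={} " +
        "ioThreadUnitThrottleTimeMs={}", clientId, byteRateThrottleMs,
        ioThreadUnitThrottleMs);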

Thanks,
Dong





> > > > > > > > > > >
> > > > > > > > > > > > If the purpose of the KIP is only to protect the
> > cluster
> > > > from
> > > > > > > being
> > > > > > > > > > > > overwhelmed by crazy clients and is not intended to
> > > address
> > > > > > > > resource
> > > > > > > > > > > > allocation problem among the clients, I am wondering
> if
> > > > using
> > > > > > > > request
> > > > > > > > > > > > handling time quota (CPU time quota) is a better
> > option.
> > > > Here
> > > > > > are
> > > > > > > > the
> > > > > > > > > > > > reasons:
> > > > > > > > > > > >
> > > > > > > > > > > > 1. request handling time quota has better protection.
> > Say
> > > > we
> > > > > > have
> > > > > > > > > > request
> > > > > > > > > > > > rate quota and set that to some value like 100
> > > > requests/sec,
> > > > > it
> > > > > > > is
> > > > > > > > > > > possible
> > > > > > > > > > > > that some of the requests are very expensive actually
> > > take
> > > > a
> > > > > > lot
> > > > > > > of
> > > > > > > > > > time
> > > > > > > > > > > to
> > > > > > > > > > > > handle. In that case a few clients may still occupy a
> > lot
> > > > of
> > > > > > CPU
> > > > > > > > time
> > > > > > > > > > > even
> > > > > > > > > > > > the request rate is low. Arguably we can carefully
> set
> > > > > request
> > > > > > > rate
> > > > > > > > > > quota
> > > > > > > > > > > > for each request and client id combination, but it
> > could
> > > > > still
> > > > > > be
> > > > > > > > > > tricky
> > > > > > > > > > > to
> > > > > > > > > > > > get it right for everyone.
> > > > > > > > > > > >
> > > > > > > > > > > > If we use the request time handling quota, we can
> > simply
> > > > say
> > > > > no
> > > > > > > > > clients
> > > > > > > > > > > can
> > > > > > > > > > > > take up to more than 30% of the total request
> handling
> > > > > capacity
> > > > > > > > > > (measured
> > > > > > > > > > > > by time), regardless of the difference among
> different
> > > > > requests
> > > > > > > or
> > > > > > > > > what
> > > > > > > > > > > is
> > > > > > > > > > > > the client doing. In this case maybe we can quota all
> > the
> > > > > > > requests
> > > > > > > > if
> > > > > > > > > > we
> > > > > > > > > > > > want to.
> > > > > > > > > > > >
> > > > > > > > > > > > 2. The main benefit of using request rate limit is
> that
> > > it
> > > > > > seems
> > > > > > > > more
> > > > > > > > > > > > intuitive. It is true that it is probably easier to
> > > explain
> > > > > to
> > > > > > > the
> > > > > > > > > user
> > > > > > > > > > > > what does that mean. However, in practice it looks
> the
> > > > impact
> > > > > > of
> > > > > > > > > > request
> > > > > > > > > > > > rate quota is not more quantifiable than the request
> > > > handling
> > > > > > > time
> > > > > > > > > > quota.
> > > > > > > > > > > > Unlike the byte rate quota, it is still difficult to
> > > give a
> > > > > > > number
> > > > > > > > > > about
> > > > > > > > > > > > impact of throughput or latency when a request rate
> > quota
> > > > is
> > > > > > hit.
> > > > > > > > So
> > > > > > > > > it
> > > > > > > > > > > is
> > > > > > > > > > > > not better than the request handling time quota. In
> > fact
> > > I
> > > > > feel
> > > > > > > it
> > > > > > > > is
> > > > > > > > > > > > clearer to tell user that "you are limited because
> you
> > > have
> > > > > > taken
> > > > > > > > 30%
> > > > > > > > > > of
> > > > > > > > > > > > the CPU time on the broker" than otherwise something
> > like
> > > > > "your
> > > > > > > > > request
> > > > > > > > > > > > rate quota on metadata request has reached".
> > > > > > > > > > > >
> > > > > > > > > > > > Thanks,
> > > > > > > > > > > >
> > > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <
> > > > jay@confluent.io
> > > > > >
> > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > I think this proposal makes a lot of sense
> > (especially
> > > > now
> > > > > > that
> > > > > > > > it
> > > > > > > > > is
> > > > > > > > > > > > > oriented around request rate) and fills the biggest
> > > > > remaining
> > > > > > > gap
> > > > > > > > > in
> > > > > > > > > > > the
> > > > > > > > > > > > > multi-tenancy story.
> > > > > > > > > > > > >
> > > > > > > > > > > > > I think for intra-cluster communication
> (StopReplica,
> > > > etc)
> > > > > we
> > > > > > > > could
> > > > > > > > > > > avoid
> > > > > > > > > > > > > throttling entirely. You can secure or otherwise
> > > > lock-down
> > > > > > the
> > > > > > > > > > cluster
> > > > > > > > > > > > > communication to avoid any unauthorized external
> > party
> > > > from
> > > > > > > > trying
> > > > > > > > > to
> > > > > > > > > > > > > initiate these requests. As a result we are as
> likely
> > > to
> > > > > > cause
> > > > > > > > > > problems
> > > > > > > > > > > > as
> > > > > > > > > > > > > solve them by throttling these, right?
> > > > > > > > > > > > >
> > > > > > > > > > > > > I'm not so sure that we should exempt the consumer
> > > > requests
> > > > > > > such
> > > > > > > > as
> > > > > > > > > > > > > heartbeat. It's true that if we throttle an app's
> > > > heartbeat
> > > > > > > > > requests
> > > > > > > > > > it
> > > > > > > > > > > > may
> > > > > > > > > > > > > cause it to fall out of its consumer group. However
> > if
> > > we
> > > > > > don't
> > > > > > > > > > > throttle
> > > > > > > > > > > > it
> > > > > > > > > > > > > it may DDOS the cluster if the heartbeat interval
> is
> > > set
> > > > > > > > > incorrectly
> > > > > > > > > > or
> > > > > > > > > > > > if
> > > > > > > > > > > > > some client in some language has a bug. I think the
> > > > policy
> > > > > > with
> > > > > > > > > this
> > > > > > > > > > > kind
> > > > > > > > > > > > > of throttling is to protect the cluster above any
> > > > > individual
> > > > > > > app,
> > > > > > > > > > > right?
> > > > > > > > > > > > I
> > > > > > > > > > > > > think in general this should be okay since for most
> > > > > > deployments
> > > > > > > > > this
> > > > > > > > > > > > > setting is meant as more of a safety valve---that
> is
> > > > rather
> > > > > > > than
> > > > > > > > > set
> > > > > > > > > > > > > something very close to what you expect to need
> (say
> > 2
> > > > > > req/sec
> > > > > > > or
> > > > > > > > > > > > whatever)
> > > > > > > > > > > > > you would have something quite high (like 100
> > req/sec)
> > > > with
> > > > > > > this
> > > > > > > > > > meant
> > > > > > > > > > > to
> > > > > > > > > > > > > prevent a client gone crazy. I think when used this
> > way
> > > > > > > allowing
> > > > > > > > > > those
> > > > > > > > > > > to
> > > > > > > > > > > > > be throttled would actually provide meaningful
> > > > protection.
> > > > > > > > > > > > >
> > > > > > > > > > > > > -Jay
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <
> > > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > > >
> > > > > > > > > > > > > wrote:
> > > > > > > > > > > > >
> > > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > I have just created KIP-124 to introduce request
> > rate
> > > > > > quotas
> > > > > > > to
> > > > > > > > > > > Kafka:
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > https://cwiki.apache.org/
> > > confluence/display/KAFKA/KIP-
> > > > > > > > > > > > > > 124+-+Request+rate+quotas
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > The proposal is for a simple percentage request
> > > > handling
> > > > > > time
> > > > > > > > > quota
> > > > > > > > > > > > that
> > > > > > > > > > > > > > can be allocated to *<client-id>*, *<user>* or
> > > *<user,
> > > > > > > > > client-id>*.
> > > > > > > > > > > > There
> > > > > > > > > > > > > > are a few other suggestions also under "Rejected
> > > > > > > alternatives".
> > > > > > > > > > > > Feedback
> > > > > > > > > > > > > > and suggestions are welcome.
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Thank you...
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Regards,
> > > > > > > > > > > > > >
> > > > > > > > > > > > > > Rajini
> > > > > > > > > > > > > >
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > --
> > > > > > > > > > -- Guozhang
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>
>
>
> --
> -- Guozhang
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Guozhang Wang <wa...@gmail.com>.
Made a pass over the doc, overall LGTM except a minor comment on the
throttling implementation:

Stated as "Request processing time throttling will be applied on top if
necessary." I thought that it meant the request processing time throttling
is applied first, but continue reading I found it actually meant to apply
produce / fetch byte rate throttling first.

Also the last sentence "The remaining delay if any is applied to the
response." is a bit confusing to me. Maybe rewording it a bit?


Guozhang


On Wed, Feb 22, 2017 at 3:24 PM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> Thanks for the updated KIP. The latest proposal looks good to me.
>
> Jun
>
> On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Jun/Roger,
> >
> > Thank you for the feedback.
> >
> > 1. I have updated the KIP to use absolute units instead of percentage.
> The
> > property is called *io_thread_units* to align with the thread count
> > property *num.io.threads*. When we implement network thread utilization
> > quotas, we can add another property *network_thread_units*.
> >
> > 2. ControlledShutdown is already listed under the exempt requests. Jun,
> did
> > you mean a different request that needs to be added? The four requests
> > currently exempt in the KIP are StopReplica, ControlledShutdown,
> > LeaderAndIsr and UpdateMetadata. These are controlled using ClusterAction
> > ACL, so it is easy to exclude and only throttle if unauthorized. I wasn't
> > sure if there are other requests used only for inter-broker that needed
> to
> > be excluded.
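
A sketch of how that exemption check might behave (the helper inputs here
are illustrative stand-ins, not the real authorizer API):

    // StopReplica, ControlledShutdown, LeaderAndIsr and UpdateMetadata skip
    // throttling only when ClusterAction authorization succeeds; an
    // unauthorized caller sending these is still throttled, which is what
    // closes the DoS hole in a secure cluster.
    static boolean shouldThrottle(boolean isInterBrokerRequestType,
                                  boolean clusterActionAuthorized) {
        return !(isInterBrokerRequestType && clusterActionAuthorized);
    }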
> >
> > 3. I was thinking the smallest change would be to replace all references
> to
> > *requestChannel.sendResponse()* with a local method
> > *sendResponseMaybeThrottle()* that does the throttling if any plus send
> > response. If we throttle first in *KafkaApis.handle()*, the time spent
> > within the method handling the request will not be recorded or used in
> > throttling. We can look into this again when the PR is ready for review.
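
Roughly, the change could look like this (sketched in Java for brevity,
though the broker code is Scala; all types and names here are illustrative):

    class ResponseThrottling {
        // Illustrative stand-ins for the real broker collaborators.
        interface Quota { long recordAndGetDelayMs(String clientId, long timeNanos); }
        interface Channel { void sendResponse(Object response); }
        interface DelayQueue { void sendAfterDelay(Object response, long delayMs); }

        Quota requestTimeQuota;
        Channel requestChannel;
        DelayQueue throttlePurgatory;

        // Called wherever requestChannel.sendResponse() is called today, so
        // the time spent inside the handler method is included in the quota.
        void sendResponseMaybeThrottle(String clientId, long handlerStartNanos,
                                       Object response) {
            long handlerTimeNanos = System.nanoTime() - handlerStartNanos;
            long delayMs = requestTimeQuota.recordAndGetDelayMs(clientId, handlerTimeNanos);
            if (delayMs > 0)
                throttlePurgatory.sendAfterDelay(response, delayMs);
            else
                requestChannel.sendResponse(response);
        }
    }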
> >
> > Regards,
> >
> > Rajini
> >
> >
> >
> > On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <ro...@gmail.com>
> > wrote:
> >
> > > Great to see this KIP and the excellent discussion.
> > >
> > > To me, Jun's suggestion makes sense.  If my application is allocated 1
> > > request handler unit, then it's as if I have a Kafka broker with a
> single
> > > request handler thread dedicated to me.  That's the most I can use, at
> > > least.  That allocation doesn't change even if an admin later increases
> > the
> > > size of the request thread pool on the broker.  It's similar to the CPU
> > > abstraction that VMs and containers get from hypervisors or OS
> > schedulers.
> > > While different client access patterns can use wildly different amounts
> > of
> > > request thread resources per request, a given application will
> generally
> > > have a stable access pattern and can figure out empirically how many
> > > "request thread units" it needs to meet it's throughput/latency goals.
> > >
> > > Cheers,
> > >
> > > Roger
> > >
> > > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > Thanks for the updated KIP. A few more comments.
> > > >
> > > > 1. A concern of request_time_percent is that it's not an absolute
> > value.
> > > > Let's say you give a user a 10% limit. If the admin doubles the
> number
> > of
> > > > request handler threads, that user now actually has twice the
> absolute
> > > > capacity. This may confuse people a bit. So, perhaps setting the
> quota
> > > > based on an absolute request thread unit is better.
> > > >
> > > > 2. ControlledShutdownRequest is also an inter-broker request and
> needs
> > to
> > > > be excluded from throttling.
> > > >
> > > > 3. Implementation wise, I am wondering if it's simpler to apply the
> > > request
> > > > time throttling first in KafkaApis.handle(). Otherwise, we will need
> to
> > > add
> > > > the throttling logic in each type of request.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> > rajinisivaram@gmail.com
> > > >
> > > > wrote:
> > > >
> > > > > Jun,
> > > > >
> > > > > Thank you for the review.
> > > > >
> > > > > I have reverted to the original KIP that throttles based on request
> > > > handler
> > > > > utilization. At the moment, it uses percentage, but I am happy to
> > > change
> > > > to
> > > > > a fraction (out of 1 instead of 100) if required. I have added the
> > > > examples
> > > > > from this discussion to the KIP. Also added a "Future Work" section
> > to
> > > > > address network thread utilization. The configuration is named
> > > > > "request_time_percent" with the expectation that it can also be
> used
> > as
> > > > the
> > > > > limit for network thread utilization when that is implemented, so
> > that
> > > > > users have to set only one config for the two and not have to worry
> > > about
> > > > > the internal distribution of the work between the two thread pools
> in
> > > > > Kafka.
> > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > > >
> > > > >
> > > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <ju...@confluent.io>
> wrote:
> > > > >
> > > > > > Hi, Rajini,
> > > > > >
> > > > > > Thanks for the proposal.
> > > > > >
> > > > > > The benefit of using the request processing time over the request
> > > rate
> > > > is
> > > > > > exactly what people have said. I will just expand that a bit.
> > > Consider
> > > > > the
> > > > > > following case. The producer sends a produce request with a 10MB
> > > > message
> > > > > > but compressed to 100KB with gzip. The decompression of the
> message
> > > on
> > > > > the
> > > > > > broker could take 10-15 seconds, during which time, a request
> > handler
> > > > > > thread is completely blocked. In this case, neither the byte-in
> > quota
> > > > nor
> > > > > > the request rate quota may be effective in protecting the broker.
> > > > > Consider
> > > > > > another case. A consumer group starts with 10 instances and later
> > on
> > > > > > switches to 20 instances. The request rate will likely double,
> but
> > > the
> > > > > > actual load on the broker may not double since each fetch
> request
> > > > only
> > > > > > contains half of the partitions. Request rate quota may not be
> easy
> > > to
> > > > > > configure in this case.
> > > > > >
> > > > > > What we really want is to be able to prevent a client from using
> > too
> > > > much
> > > > > > of the server side resources. In this particular KIP, this
> resource
> > > is
> > > > > the
> > > > > > capacity of the request handler threads. I agree that it may not
> be
> > > > > > intuitive for the users to determine how to set the right limit.
> > > > However,
> > > > > > this is not completely new and has been done in the container
> world
> > > > > > already. For example, Linux cgroup (https://access.redhat.com/
> > > > > > documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
> > > > > > Resource_Management_Guide/sec-cpu.html) has the concept of
> > > > > > cpu.cfs_quota_us,
> > > > > > which specifies the total amount of time in microseconds for
> which
> > > all
> > > > > > tasks in a cgroup can run during a one second period. We can
> > > > potentially
> > > > > > model the request handler threads in a similar way. For example,
> > each
> > > > > > request handler thread can be 1 request handler unit and the
> admin
> > > can
> > > > > > configure a limit on how many units (say 0.01) a client can have.
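
To make the unit arithmetic concrete (illustrative numbers only):

    // One request handler thread == 1 unit == 1000 ms of handler time per
    // second. A client with a 0.01 unit quota may therefore use at most
    // 10 ms of request handler time per second, and that allowance stays
    // fixed even if num.io.threads is later increased.
    static double allowedHandlerMsPerSec(double quotaUnits) {
        return quotaUnits * 1000.0; // 0.01 -> 10.0 ms/sec
    }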
> > > > > >
> > > > > > Regarding not throttling the internal broker to broker requests.
> We
> > > > could
> > > > > > do that. Alternatively, we could just let the admin configure a
> > high
> > > > > limit
> > > > > > for the kafka user (it may not be able to do that easily based on
> > > > > clientId
> > > > > > though).
> > > > > >
> > > > > > Ideally we want to be able to protect the utilization of the
> > network
> > > > > thread
> > > > > > pool too. The difficulty is mostly what Rajini said: (1) The
> > mechanism
> > > > for
> > > > > > throttling the requests is through Purgatory and we will have to
> > > think
> > > > > > through how to integrate that into the network layer.  (2) In the
> > > > network
> > > > > > layer, currently we know the user, but not the clientId of the
> > > request.
> > > > > So,
> > > > > > it's a bit tricky to throttle based on clientId there. Plus, the
> > > > byteOut
> > > > > > quota can already protect the network thread utilization for
> fetch
> > > > > > requests. So, if we can't figure out this part right now, just
> > > focusing
> > > > > on
> > > > > > the request handling threads for this KIP is still a useful
> > feature.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jun
> > > > > >
> > > > > >
> > > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <
> > > > rajinisivaram@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Thank you all for the feedback.
> > > > > > >
> > > > > > > Jay: I have removed exemption for consumer heartbeat etc. Agree
> > > that
> > > > > > > protecting the cluster is more important than protecting
> > individual
> > > > > apps.
> > > > > > > Have retained the exemption for StopReplica/LeaderAndIsr etc,
> > > these
> > > > > are
> > > > > > > throttled only if authorization fails (so can't be used for DoS
> > > > attacks
> > > > > > in
> > > > > > > a secure cluster, but allows inter-broker requests to complete
> > > > without
> > > > > > > delays).
> > > > > > >
> > > > > > > I will wait another day to see if there is any objection to
> > quotas
> > > > > based
> > > > > > on
> > > > > > > request processing time (as opposed to request rate) and if
> there
> > > are
> > > > > no
> > > > > > > objections, I will revert to the original proposal with some
> > > changes.
> > > > > > >
> > > > > > > The original proposal was only including the time used by the
> > > request
> > > > > > > handler threads (that made calculation easy). I think the
> > > suggestion
> > > > is
> > > > > > to
> > > > > > > include the time spent in the network threads as well since
> that
> > > may
> > > > be
> > > > > > > significant. As Jay pointed out, it is more complicated to
> > > calculate
> > > > > the
> > > > > > > total available CPU time and convert to a ratio when there are *m*
> > I/O
> > > > > > threads
> > > > > > > and *n* network threads. ThreadMXBean#getThreadCPUTime() may
> > give
> > > us
> > > > > > what
> > > > > > > we want, but it can be very expensive on some platforms. As
> > Becket
> > > > and
> > > > > > > Guozhang have pointed out, we do have several time measurements
> > > > already
> > > > > > for
> > > > > > > generating metrics that we could use, though we might want to
> > > switch
> > > > to
> > > > > > > nanoTime() instead of currentTimeMillis() since some of the
> > values
> > > > for
> > > > > > > small requests may be < 1ms. But rather than add up the time
> > spent
> > > in
> > > > > I/O
> > > > > > > thread and network thread, wouldn't it be better to convert the
> > > time
> > > > > > spent
> > > > > > > on each thread into a separate ratio? UserA has a request quota
> > of
> > > > 5%.
> > > > > > Can
> > > > > > > we take that to mean that UserA can use 5% of the time on
> network
> > > > > threads
> > > > > > > and 5% of the time on I/O threads? If either is exceeded, the
> > > > response
> > > > > is
> > > > > > > throttled - it would mean maintaining two sets of metrics for
> the
> > > two
> > > > > > > durations, but would result in more meaningful ratios. We could
> > > > define
> > > > > > two
> > > > > > > quota limits (UserA has 5% of request threads and 10% of
> network
> > > > > > threads),
> > > > > > > but that seems unnecessary and harder to explain to users.
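
Concretely, the check could be something like this (hypothetical names):

    // One quota value, applied independently to each thread pool. Each ratio
    // is the user's busy time divided by the pool's total available time in
    // the metrics window.
    static boolean violatesQuota(double quota,          // e.g. 0.05 for 5%
                                 long ioBusyNanos, int numIoThreads,
                                 long netBusyNanos, int numNetworkThreads,
                                 long windowNanos) {
        double ioRatio  = (double) ioBusyNanos  / (numIoThreads * windowNanos);
        double netRatio = (double) netBusyNanos / (numNetworkThreads * windowNanos);
        return ioRatio > quota || netRatio > quota;
    }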
> > > > > > >
> > > > > > > Back to why and how quotas are applied to network thread
> > > utilization:
> > > > > > > a) In the case of fetch,  the time spent in the network thread
> > may
> > > be
> > > > > > > significant and I can see the need to include this. Are there
> > other
> > > > > > > requests where the network thread utilization is significant?
> In
> > > the
> > > > > case
> > > > > > > of fetch, request handler thread utilization would throttle
> > clients
> > > > > with
> > > > > > > high request rate, low data volume and fetch byte rate quota
> will
> > > > > > throttle
> > > > > > > clients with high data volume. Network thread utilization is
> > > perhaps
> > > > > > > proportional to the data volume. I am wondering if we even need
> > to
> > > > > > throttle
> > > > > > > based on network thread utilization or whether the data volume
> > > quota
> > > > > > covers
> > > > > > > this case.
> > > > > > >
> > > > > > > b) At the moment, we record and check for quota violation at
> the
> > > same
> > > > > > time.
> > > > > > > If a quota is violated, the response is delayed. Using Jay's
> > > example
> > > > of
> > > > > > > disk reads for fetches happening in the network thread, we
> can't
> > > > record
> > > > > > and
> > > > > > > delay a response after the disk reads. We could record the time
> > > spent
> > > > > on
> > > > > > > the network thread when the response is complete and introduce
> a
> > > > delay
> > > > > > for
> > > > > > > handling a subsequent request (separate out recording and quota
> > > > > violation
> > > > > > > handling in the case of network thread overload). Does that
> make
> > > > sense?
> > > > > > >
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > > >
> > > > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <
> > becket.qin@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > Hey Jay,
> > > > > > > >
> > > > > > > > Yeah, I agree that enforcing the CPU time is a little
> tricky. I
> > > am
> > > > > > > thinking
> > > > > > > > that maybe we can use the existing request statistics. They
> are
> > > > > already
> > > > > > > > very detailed so we can probably see the approximate CPU time
> > > from
> > > > > it,
> > > > > > > e.g.
> > > > > > > > something like (total_time - request/response_queue_time -
> > > > > > remote_time).
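
Spelled out with the existing per-request metrics (names illustrative):

    // Approximate broker-side processing time for a request: total time
    // minus the time spent queued and the time spent waiting on remote
    // replicas, neither of which consumes a broker thread.
    static long approxCpuTimeMs(long totalTimeMs, long requestQueueTimeMs,
                                long responseQueueTimeMs, long remoteTimeMs) {
        return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs - remoteTimeMs;
    }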
> > > > > > > >
> > > > > > > > I agree with Guozhang that when a user is throttled it is
> > likely
> > > > that
> > > > > > we
> > > > > > > > need to see if anything has gone wrong first, and if the
> users
> > > are
> > > > > well
> > > > > > > > behaving and just need more resources, we will have to bump
> up
> > > the
> > > > > > quota
> > > > > > > > for them. It is true that pre-allocating CPU time quota
> > precisely
> > > > for
> > > > > > the
> > > > > > > > users is difficult. So in practice it would probably be more
> > like
> > > > > first
> > > > > > > set
> > > > > > > > a relatively high protective CPU time quota for everyone and
> > > increase
> > > > > > that
> > > > > > > > for some individual clients on demand.
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <
> > > wangguoz@gmail.com
> > > > >
> > > > > > > wrote:
> > > > > > > >
> > > > > > > > > This is a great proposal, glad to see it happening.
> > > > > > > > >
> > > > > > > > > I am inclined to the CPU throttling, or more specifically
> > > > > processing
> > > > > > > time
> > > > > > > > > ratio instead of the request rate throttling as well.
> Becket
> > > has
> > > > > very
> > > > > > > > well
> > > > > > > > > summed my rationales above, and one thing to add here is
> that
> > > the
> > > > > > > former
> > > > > > > > > has a good support for both "protecting against rogue
> > clients"
> > > as
> > > > > > well
> > > > > > > as
> > > > > > > > > "utilizing a cluster for multi-tenancy usage": when
> thinking
> > > > about
> > > > > > how
> > > > > > > to
> > > > > > > > > explain this to the end users, I find it actually more
> > natural
> > > > than
> > > > > > the
> > > > > > > > > request rate since as mentioned above, different requests
> > will
> > > > have
> > > > > > > quite
> > > > > > > > > different "cost", and Kafka today already have various
> > request
> > > > > types
> > > > > > > > > (produce, fetch, admin, metadata, etc), because of that the
> > > > request
> > > > > > > rate
> > > > > > > > > throttling may not be as effective unless it is set very
> > > > > > > conservatively.
> > > > > > > > >
> > > > > > > > > Regarding user reactions when they are throttled, I
> think
> > it
> > > > may
> > > > > > > > differ
> > > > > > > > > case-by-case, and need to be discovered / guided by looking
> > at
> > > > > > relative
> > > > > > > > > metrics. So in other words users would not expect to get
> > > > additional
> > > > > > > > > information by simply being told "hey, you are throttled",
> > > which
> > > > is
> > > > > > all
> > > > > > > > > what throttling does; they need to take a follow-up step
> and
> > > see
> > > > > > "hmm,
> > > > > > > > I'm
> > > > > > > > > throttled probably because of ..", which is by looking at
> > other
> > > > > > metric
> > > > > > > > > values: e.g. whether I'm bombarding the brokers with
> metadata
> > > > > > request,
> > > > > > > > > which are usually cheap to handle but I'm sending thousands
> > per
> > > > > > second;
> > > > > > > > or
> > > > > > > > > is it because I'm catching up and hence sending very heavy
> > > > fetching
> > > > > > > > request
> > > > > > > > > with large min.bytes, etc.
> > > > > > > > >
> > > > > > > > > Regarding the implementation, as once discussed with
> Jun,
> > > this
> > > > > > seems
> > > > > > > > not
> > > > > > > > > very difficult since today we are already collecting the
> > > "thread
> > > > > pool
> > > > > > > > > utilization" metrics, which is a single percentage
> > > > > > "aggregateIdleMeter"
> > > > > > > > > value; but we are already effectively aggregating it for
> each
> > > > > > request
> > > > > > > in
> > > > > > > > > KafkaRequestHandler, and we can just extend it by recording
> > the
> > > > > > source
> > > > > > > > > client id when handling them and aggregating by clientId as
> > > well
> > > > as
> > > > > > the
> > > > > > > > > total aggregate.
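
A minimal sketch of that extension (hypothetical; the real change would live
in KafkaRequestHandler):

    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    class PerClientHandlerTime {
        // Busy handler time per clientId, kept alongside the existing
        // aggregate idle meter.
        private final ConcurrentHashMap<String, LongAdder> busyNanosByClient =
            new ConcurrentHashMap<>();

        void record(String clientId, long busyNanos) {
            busyNanosByClient.computeIfAbsent(clientId, id -> new LongAdder())
                             .add(busyNanos);
        }

        long busyNanos(String clientId) {
            LongAdder adder = busyNanosByClient.get(clientId);
            return adder == null ? 0L : adder.sum();
        }
    }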
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > Guozhang
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <
> jay@confluent.io
> > >
> > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hey Becket/Rajini,
> > > > > > > > > >
> > > > > > > > > > When I thought about it more deeply I came around to the
> > > > "percent
> > > > > > of
> > > > > > > > > > processing time" metric too. It seems a lot closer to the
> > > thing
> > > > > we
> > > > > > > > > actually
> > > > > > > > > > care about and need to protect. I also think this would
> be
> > a
> > > > very
> > > > > > > > useful
> > > > > > > > > > metric even in the absence of throttling just to debug
> > who's
> > > > > using
> > > > > > > > > > capacity.
> > > > > > > > > >
> > > > > > > > > > Two problems to consider:
> > > > > > > > > >
> > > > > > > > > >    1. I agree that for the user it is understandable what
> > > led
> > > > to
> > > > > > > their
> > > > > > > > > >    being throttled, but it is a bit hard to figure out
> the
> > > safe
> > > > > > range
> > > > > > > > for
> > > > > > > > > >    them. i.e. if I have a new app that will send 200
> > > > > messages/sec I
> > > > > > > can
> > > > > > > > > >    probably reason that I'll be under the throttling
> limit
> > of
> > > > 300
> > > > > > > > > req/sec.
> > > > > > > > > >    However if I need to be under a 10% CPU resources
> limit
> > it
> > > > may
> > > > > > be
> > > > > > > a
> > > > > > > > > bit
> > > > > > > > > >    harder for me to know a priori if i will or won't.
> > > > > > > > > >    2. Calculating the available CPU time is a bit
> difficult
> > > > since
> > > > > > > there
> > > > > > > > > are
> > > > > > > > > >    actually two thread pools--the I/O threads and the
> > network
> > > > > > > threads.
> > > > > > > > I
> > > > > > > > > > think
> > > > > > > > > >    it might be workable to count just the I/O thread time
> > as
> > > in
> > > > > the
> > > > > > > > > > proposal,
> > > > > > > > > >    but the network thread work is actually non-trivial
> > (e.g.
> > > > all
> > > > > > the
> > > > > > > > disk
> > > > > > > > > >    reads for fetches happen in that thread). If you count
> > > both
> > > > > the
> > > > > > > > > network
> > > > > > > > > > and
> > > > > > > > > >    I/O threads it can skew things a bit. E.g. say you
> have
> > 50
> > > > > > network
> > > > > > > > > > threads,
> > > > > > > > > >    10 I/O threads, and 8 cores, what is the available cpu
> > > time
> > > > > > > > available
> > > > > > > > > > in a
> > > > > > > > > >    second? I suppose this is a problem whenever you have
> a
> > > > > > bottleneck
> > > > > > > > > > between
> > > > > > > > > >    I/O and network threads or if you end up significantly
> > > > > > > > > over-provisioning
> > > > > > > > > >    one pool (both of which are hard to avoid).
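
Working those numbers through (illustrative only):

    // 50 network threads + 10 I/O threads can report up to 60 thread-seconds
    // of busy wall-clock time per second, while an 8-core machine has only
    // 8 CPU-seconds available, so thread time can overstate CPU by 60/8 = 7.5x.
    static double threadTimeToCpuSkew(int networkThreads, int ioThreads, int cores) {
        return (double) (networkThreads + ioThreads) / cores;
    }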
> > > > > > > > > >
> > > > > > > > > > An alternative for CPU throttling would be to use this
> api:
> > > > > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/
> > > > > > > > > > management/ThreadMXBean.html#getThreadCpuTime(long)
> > > > > > > > > >
> > > > > > > > > > That would let you track actual CPU usage across the
> > network,
> > > > I/O
> > > > > > > > > threads,
> > > > > > > > > > and purgatory threads and look at it as a percentage of
> > total
> > > > > > cores.
> > > > > > > I
> > > > > > > > > > think this fixes many problems in the reliability of the
> > > > metric.
> > > > > > Its
> > > > > > > > > > meaning is slightly different as it is just CPU (you
> don't
> > > get
> > > > > > > charged
> > > > > > > > > for
> > > > > > > > > > time blocking on I/O) but that may be okay because we
> > already
> > > > > have
> > > > > > a
> > > > > > > > > > throttle on I/O. The downside is I think it is possible
> > this
> > > > api
> > > > > > can
> > > > > > > be
> > > > > > > > > > disabled or isn't always available and it may also be
> > > expensive
> > > > > > (also
> > > > > > > > > I've
> > > > > > > > > > never used it so not sure if it really works the way i
> > > think).
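
For reference, usage of that API looks roughly like this; the guards matter
because per-thread CPU time measurement can be unsupported or disabled:

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    // Per-thread CPU time in nanoseconds; this excludes time blocked on I/O.
    static long threadCpuNanos(long threadId) {
        ThreadMXBean bean = ManagementFactory.getThreadMXBean();
        if (bean.isThreadCpuTimeSupported() && bean.isThreadCpuTimeEnabled())
            return bean.getThreadCpuTime(threadId); // -1 if the thread is not alive
        return -1L;
    }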
> > > > > > > > > >
> > > > > > > > > > -Jay
> > > > > > > > > >
> > > > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <
> > > > > becket.qin@gmail.com>
> > > > > > > > > wrote:
> > > > > > > > > >
> > > > > > > > > > > If the purpose of the KIP is only to protect the
> cluster
> > > from
> > > > > > being
> > > > > > > > > > > overwhelmed by crazy clients and is not intended to
> > address
> > > > > > > resource
> > > > > > > > > > > allocation problem among the clients, I am wondering if
> > > using
> > > > > > > request
> > > > > > > > > > > handling time quota (CPU time quota) is a better
> option.
> > > Here
> > > > > are
> > > > > > > the
> > > > > > > > > > > reasons:
> > > > > > > > > > >
> > > > > > > > > > > 1. request handling time quota has better protection.
> Say
> > > we
> > > > > have
> > > > > > > > > request
> > > > > > > > > > > rate quota and set that to some value like 100
> > > requests/sec,
> > > > it
> > > > > > is
> > > > > > > > > > possible
> > > > > > > > > > > that some of the requests are very expensive actually
> > take
> > > a
> > > > > lot
> > > > > > of
> > > > > > > > > time
> > > > > > > > > > to
> > > > > > > > > > > handle. In that case a few clients may still occupy a
> lot
> > > of
> > > > > CPU
> > > > > > > time
> > > > > > > > > > even if
> > > > > > > > > > > the request rate is low. Arguably we can carefully set
> > > > request
> > > > > > rate
> > > > > > > > > quota
> > > > > > > > > > > for each request and client id combination, but it
> could
> > > > still
> > > > > be
> > > > > > > > > tricky
> > > > > > > > > > to
> > > > > > > > > > > get it right for everyone.
> > > > > > > > > > >
> > > > > > > > > > > If we use the request handling time quota, we can
> simply
> > > say
> > > > no
> > > > > > > > clients
> > > > > > > > > > can
> > > > > > > > > > > take more than 30% of the total request handling
> > > > capacity
> > > > > > > > > (measured
> > > > > > > > > > > by time), regardless of the difference among different
> > > > requests
> > > > > > or
> > > > > > > > what
> > > > > > > > > > is
> > > > > > > > > > > the client doing. In this case maybe we can quota all
> the
> > > > > > requests
> > > > > > > if
> > > > > > > > > we
> > > > > > > > > > > want to.
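
In other words (illustrative numbers):

    // 8 request handler threads provide 8 * 1000 = 8000 ms of handling time
    // per second in total; a 30% cap allows any one client at most 2400 ms
    // of that per second.
    static long clientAllowanceMsPerSec(int handlerThreads, double cap) {
        return (long) (handlerThreads * 1000L * cap); // 8 threads, 0.30 -> 2400
    }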
> > > > > > > > > > >
> > > > > > > > > > > 2. The main benefit of using request rate limit is that
> > it
> > > > > seems
> > > > > > > more
> > > > > > > > > > > intuitive. It is true that it is probably easier to
> > explain
> > > > to
> > > > > > the
> > > > > > > > user
> > > > > > > > > > > what that means. However, in practice it looks like
> > > impact
> > > > > of
> > > > > > > > > request
> > > > > > > > > > > rate quota is not more quantifiable than the request
> > > handling
> > > > > > time
> > > > > > > > > quota.
> > > > > > > > > > > Unlike the byte rate quota, it is still difficult to
> > give a
> > > > > > number
> > > > > > > > > about
> > > > > > > > > > > the impact on throughput or latency when a request rate
> quota
> > > is
> > > > > hit.
> > > > > > > So
> > > > > > > > it
> > > > > > > > > > is
> > > > > > > > > > > not better than the request handling time quota. In
> fact
> > I
> > > > feel
> > > > > > it
> > > > > > > is
> > > > > > > > > > > clearer to tell user that "you are limited because you
> > have
> > > > > taken
> > > > > > > 30%
> > > > > > > > > of
> > > > > > > > > > > the CPU time on the broker" than otherwise something
> like
> > > > "your
> > > > > > > > request
> > > > > > > > > > > rate quota on metadata request has reached".
> > > > > > > > > > >
> > > > > > > > > > > Thanks,
> > > > > > > > > > >
> > > > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <
> > > jay@confluent.io
> > > > >
> > > > > > > wrote:
> > > > > > > > > > >
> > > > > > > > > > > > I think this proposal makes a lot of sense
> (especially
> > > now
> > > > > that
> > > > > > > it
> > > > > > > > is
> > > > > > > > > > > > oriented around request rate) and fills the biggest
> > > > remaining
> > > > > > gap
> > > > > > > > in
> > > > > > > > > > the
> > > > > > > > > > > > multi-tenancy story.
> > > > > > > > > > > >
> > > > > > > > > > > > I think for intra-cluster communication (StopReplica,
> > > etc)
> > > > we
> > > > > > > could
> > > > > > > > > > avoid
> > > > > > > > > > > > throttling entirely. You can secure or otherwise
> > > lock-down
> > > > > the
> > > > > > > > > cluster
> > > > > > > > > > > > communication to avoid any unauthorized external
> party
> > > from
> > > > > > > trying
> > > > > > > > to
> > > > > > > > > > > > initiate these requests. As a result we are as likely
> > to
> > > > > cause
> > > > > > > > > problems
> > > > > > > > > > > as
> > > > > > > > > > > > solve them by throttling these, right?
> > > > > > > > > > > >
> > > > > > > > > > > > I'm not so sure that we should exempt the consumer
> > > requests
> > > > > > such
> > > > > > > as
> > > > > > > > > > > > heartbeat. It's true that if we throttle an app's
> > > heartbeat
> > > > > > > > requests
> > > > > > > > > it
> > > > > > > > > > > may
> > > > > > > > > > > > cause it to fall out of its consumer group. However
> if
> > we
> > > > > don't
> > > > > > > > > > throttle
> > > > > > > > > > > it
> > > > > > > > > > > > it may DDOS the cluster if the heartbeat interval is
> > set
> > > > > > > > incorrectly
> > > > > > > > > or
> > > > > > > > > > > if
> > > > > > > > > > > > some client in some language has a bug. I think the
> > > policy
> > > > > with
> > > > > > > > this
> > > > > > > > > > kind
> > > > > > > > > > > > of throttling is to protect the cluster above any
> > > > individual
> > > > > > app,
> > > > > > > > > > right?
> > > > > > > > > > > I
> > > > > > > > > > > > think in general this should be okay since for most
> > > > > deployments
> > > > > > > > this
> > > > > > > > > > > > setting is meant as more of a safety valve---that is
> > > rather
> > > > > > than
> > > > > > > > set
> > > > > > > > > > > > something very close to what you expect to need (say
> 2
> > > > > req/sec
> > > > > > or
> > > > > > > > > > > whatever)
> > > > > > > > > > > > you would have something quite high (like 100
> req/sec)
> > > with
> > > > > > this
> > > > > > > > > meant
> > > > > > > > > > to
> > > > > > > > > > > > prevent a client gone crazy. I think when used this
> way
> > > > > > allowing
> > > > > > > > > those
> > > > > > > > > > to
> > > > > > > > > > > > be throttled would actually provide meaningful
> > > protection.
> > > > > > > > > > > >
> > > > > > > > > > > > -Jay
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <
> > > > > > > > > > rajinisivaram@gmail.com
> > > > > > > > > > > >
> > > > > > > > > > > > wrote:
> > > > > > > > > > > >
> > > > > > > > > > > > > Hi all,
> > > > > > > > > > > > >
> > > > > > > > > > > > > I have just created KIP-124 to introduce request
> rate
> > > > > quotas
> > > > > > to
> > > > > > > > > > Kafka:
> > > > > > > > > > > > >
> > > > > > > > > > > > > https://cwiki.apache.org/
> > confluence/display/KAFKA/KIP-
> > > > > > > > > > > > > 124+-+Request+rate+quotas
> > > > > > > > > > > > >
> > > > > > > > > > > > > The proposal is for a simple percentage request
> > > handling
> > > > > time
> > > > > > > > quota
> > > > > > > > > > > that
> > > > > > > > > > > > > can be allocated to *<client-id>*, *<user>* or
> > *<user,
> > > > > > > > client-id>*.
> > > > > > > > > > > There
> > > > > > > > > > > > > are a few other suggestions also under "Rejected
> > > > > > alternatives".
> > > > > > > > > > > Feedback
> > > > > > > > > > > > > and suggestions are welcome.
> > > > > > > > > > > > >
> > > > > > > > > > > > > Thank you...
> > > > > > > > > > > > >
> > > > > > > > > > > > > Regards,
> > > > > > > > > > > > >
> > > > > > > > > > > > > Rajini
> > > > > > > > > > > > >
> > > > > > > > > > > >
> > > > > > > > > > >
> > > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > --
> > > > > > > > > -- Guozhang
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>



-- 
-- Guozhang

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Thanks for the updated KIP. The latest proposal looks good to me.

Jun

On Wed, Feb 22, 2017 at 2:19 PM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Jun/Roger,
>
> Thank you for the feedback.
>
> 1. I have updated the KIP to use absolute units instead of percentage. The
> property is called* io_thread_units* to align with the thread count
> property *num.io.threads*. When we implement network thread utilization
> quotas, we can add another property *network_thread_units.*
>
> 2. ControlledShutdown is already listed under the exempt requests. Jun, did
> you mean a different request that needs to be added? The four requests
> currently exempt in the KIP are StopReplica, ControlledShutdown,
> LeaderAndIsr and UpdateMetadata. These are controlled using ClusterAction
> ACL, so it is easy to exclude and only throttle if unauthorized. I wasn't
> sure if there are other requests used only for inter-broker that needed to
> be excluded.
>
> 3. I was thinking the smallest change would be to replace all references to
> *requestChannel.sendResponse()* with a local method
> *sendResponseMaybeThrottle()* that does the throttling if any plus send
> response. If we throttle first in *KafkaApis.handle()*, the time spent
> within the method handling the request will not be recorded or used in
> throttling. We can look into this again when the PR is ready for review.
>
> Regards,
>
> Rajini
>
>
>
> On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <ro...@gmail.com>
> wrote:
>
> > Great to see this KIP and the excellent discussion.
> >
> > To me, Jun's suggestion makes sense.  If my application is allocated 1
> > request handler unit, then it's as if I have a Kafka broker with a single
> > request handler thread dedicated to me.  That's the most I can use, at
> > least.  That allocation doesn't change even if an admin later increases
> the
> > size of the request thread pool on the broker.  It's similar to the CPU
> > abstraction that VMs and containers get from hypervisors or OS
> schedulers.
> > While different client access patterns can use wildly different amounts
> of
> > request thread resources per request, a given application will generally
> > have a stable access pattern and can figure out empirically how many
> > "request thread units" it needs to meet it's throughput/latency goals.
> >
> > Cheers,
> >
> > Roger
> >
> > On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Thanks for the updated KIP. A few more comments.
> > >
> > > 1. A concern of request_time_percent is that it's not an absolute
> value.
> > > Let's say you give a user a 10% limit. If the admin doubles the number
> of
> > > request handler threads, that user now actually has twice the absolute
> > > capacity. This may confuse people a bit. So, perhaps setting the quota
> > > based on an absolute request thread unit is better.
> > >
> > > 2. ControlledShutdownRequest is also an inter-broker request and needs
> to
> > > be excluded from throttling.
> > >
> > > 3. Implementation wise, I am wondering if it's simpler to apply the
> > request
> > > time throttling first in KafkaApis.handle(). Otherwise, we will need to
> > add
> > > the throttling logic in each type of request.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <
> rajinisivaram@gmail.com
> > >
> > > wrote:
> > >
> > > > Jun,
> > > >
> > > > Thank you for the review.
> > > >
> > > > I have reverted to the original KIP that throttles based on request
> > > handler
> > > > utilization. At the moment, it uses percentage, but I am happy to
> > change
> > > to
> > > > a fraction (out of 1 instead of 100) if required. I have added the
> > > examples
> > > > from this discussion to the KIP. Also added a "Future Work" section
> to
> > > > address network thread utilization. The configuration is named
> > > > "request_time_percent" with the expectation that it can also be used
> as
> > > the
> > > > limit for network thread utilization when that is implemented, so
> that
> > > > users have to set only one config for the two and not have to worry
> > about
> > > > the internal distribution of the work between the two thread pools in
> > > > Kafka.
> > > >
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > >
> > > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <ju...@confluent.io> wrote:
> > > >
> > > > > Hi, Rajini,
> > > > >
> > > > > Thanks for the proposal.
> > > > >
> > > > > The benefit of using the request processing time over the request
> > rate
> > > is
> > > > > exactly what people have said. I will just expand that a bit.
> > Consider
> > > > the
> > > > > following case. The producer sends a produce request with a 10MB
> > > message
> > > > > but compressed to 100KB with gzip. The decompression of the message
> > on
> > > > the
> > > > > broker could take 10-15 seconds, during which time, a request
> handler
> > > > > thread is completely blocked. In this case, neither the byte-in
> quota
> > > nor
> > > > > the request rate quota may be effective in protecting the broker.
> > > > Consider
> > > > > another case. A consumer group starts with 10 instances and later
> on
> > > > > switches to 20 instances. The request rate will likely double, but
> > the
> > > > > actually load on the broker may not double since each fetch request
> > > only
> > > > > contains half of the partitions. Request rate quota may not be easy
> > to
> > > > > configure in this case.
> > > > >
> > > > > What we really want is to be able to prevent a client from using
> too
> > > much
> > > > > of the server side resources. In this particular KIP, this resource
> > is
> > > > the
> > > > > capacity of the request handler threads. I agree that it may not be
> > > > > intuitive for the users to determine how to set the right limit.
> > > However,
> > > > > this is not completely new and has been done in the container world
> > > > > already. For example, Linux cgroup (https://access.redhat.com/
> > > > > documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
> > > > > Resource_Management_Guide/sec-cpu.html) has the concept of
> > > > > cpu.cfs_quota_us,
> > > > > which specifies the total amount of time in microseconds for which
> > all
> > > > > tasks in a cgroup can run during a one second period. We can
> > > potentially
> > > > > model the request handler threads in a similar way. For example,
> each
> > > > > request handler thread can be 1 request handler unit and the admin
> > can
> > > > > configure a limit on how many units (say 0.01) a client can have.
> > > > >
> > > > > Regarding not throttling the internal broker to broker requests. We
> > > could
> > > > > do that. Alternatively, we could just let the admin configure a
> high
> > > > limit
> > > > > for the kafka user (it may not be able to do that easily based on
> > > > clientId
> > > > > though).
> > > > >
> > > > > Ideally we want to be able to protect the utilization of the
> network
> > > > thread
> > > > > pool too. The difficult is mostly what Rajini said: (1) The
> mechanism
> > > for
> > > > > throttling the requests is through Purgatory and we will have to
> > think
> > > > > through how to integrate that into the network layer.  (2) In the
> > > network
> > > > > layer, currently we know the user, but not the clientId of the
> > request.
> > > > So,
> > > > > it's a bit tricky to throttle based on clientId there. Plus, the
> > > byteOut
> > > > > quota can already protect the network thread utilization for fetch
> > > > > requests. So, if we can't figure out this part right now, just
> > focusing
> > > > on
> > > > > the request handling threads for this KIP is still a useful
> feature.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jun
> > > > >
> > > > >
> > > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <
> > > rajinisivaram@gmail.com
> > > > >
> > > > > wrote:
> > > > >
> > > > > > Thank you all for the feedback.
> > > > > >
> > > > > > Jay: I have removed exemption for consumer heartbeat etc. Agree
> > that
> > > > > > protecting the cluster is more important than protecting
> individual
> > > > apps.
> > > > > > Have retained the exemption for StopReplicat/LeaderAndIsr etc,
> > these
> > > > are
> > > > > > throttled only if authorization fails (so can't be used for DoS
> > > attacks
> > > > > in
> > > > > > a secure cluster, but allows inter-broker requests to complete
> > > without
> > > > > > delays).
> > > > > >
> > > > > > I will wait another day to see if these is any objection to
> quotas
> > > > based
> > > > > on
> > > > > > request processing time (as opposed to request rate) and if there
> > are
> > > > no
> > > > > > objections, I will revert to the original proposal with some
> > changes.
> > > > > >
> > > > > > The original proposal was only including the time used by the
> > request
> > > > > > handler threads (that made calculation easy). I think the
> > suggestion
> > > is
> > > > > to
> > > > > > include the time spent in the network threads as well since that
> > may
> > > be
> > > > > > significant. As Jay pointed out, it is more complicated to
> > calculate
> > > > the
> > > > > > total available CPU time and convert to a ratio when there *m*
> I/O
> > > > > threads
> > > > > > and *n* network threads. ThreadMXBean#getThreadCPUTime() may
> give
> > us
> > > > > what
> > > > > > we want, but it can be very expensive on some platforms. As
> Becket
> > > and
> > > > > > Guozhang have pointed out, we do have several time measurements
> > > already
> > > > > for
> > > > > > generating metrics that we could use, though we might want to
> > switch
> > > to
> > > > > > nanoTime() instead of currentTimeMillis() since some of the
> values
> > > for
> > > > > > small requests may be < 1ms. But rather than add up the time
> spent
> > in
> > > > I/O
> > > > > > thread and network thread, wouldn't it be better to convert the
> > time
> > > > > spent
> > > > > > on each thread into a separate ratio? UserA has a request quota
> of
> > > 5%.
> > > > > Can
> > > > > > we take that to mean that UserA can use 5% of the time on network
> > > > threads
> > > > > > and 5% of the time on I/O threads? If either is exceeded, the
> > > response
> > > > is
> > > > > > throttled - it would mean maintaining two sets of metrics for the
> > two
> > > > > > durations, but would result in more meaningful ratios. We could
> > > define
> > > > > two
> > > > > > quota limits (UserA has 5% of request threads and 10% of network
> > > > > threads),
> > > > > > but that seems unnecessary and harder to explain to users.
> > > > > >
> > > > > > Back to why and how quotas are applied to network thread
> > utilization:
> > > > > > a) In the case of fetch,  the time spent in the network thread
> may
> > be
> > > > > > significant and I can see the need to include this. Are there
> other
> > > > > > requests where the network thread utilization is significant? In
> > the
> > > > case
> > > > > > of fetch, request handler thread utilization would throttle
> clients
> > > > with
> > > > > > high request rate, low data volume and fetch byte rate quota will
> > > > > throttle
> > > > > > clients with high data volume. Network thread utilization is
> > perhaps
> > > > > > proportional to the data volume. I am wondering if we even need
> to
> > > > > throttle
> > > > > > based on network thread utilization or whether the data volume
> > quota
> > > > > covers
> > > > > > this case.
> > > > > >
> > > > > > b) At the moment, we record and check for quota violation at the
> > same
> > > > > time.
> > > > > > If a quota is violated, the response is delayed. Using Jay'e
> > example
> > > of
> > > > > > disk reads for fetches happening in the network thread, We can't
> > > record
> > > > > and
> > > > > > delay a response after the disk reads. We could record the time
> > spent
> > > > on
> > > > > > the network thread when the response is complete and introduce a
> > > delay
> > > > > for
> > > > > > handling a subsequent request (separate out recording and quota
> > > > violation
> > > > > > handling in the case of network thread overload). Does that make
> > > sense?
> > > > > >
> > > > > >
> > > > > > Regards,
> > > > > >
> > > > > > Rajini

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Jun/Roger,

Thank you for the feedback.

1. I have updated the KIP to use absolute units instead of a percentage. The
property is called *io_thread_units* to align with the thread count
property *num.io.threads*. When we implement network thread utilization
quotas, we can add another property, *network_thread_units*.
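
To make the unit semantics concrete, here is a minimal sketch (the
one-unit-per-handler-thread interpretation follows Jun's suggestion;
the names are placeholders, not final):

    // One unit corresponds to the full capacity of one request handler
    // thread, so a quota of 0.1 units allows 100 ms of I/O thread time
    // per second, independent of how many threads the broker runs.
    val ioThreadUnits = 0.1
    val windowMs = 1000L
    val allowedIoThreadTimeMs = ioThreadUnits * windowMs // 100.0 ms/window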

2. ControlledShutdown is already listed under the exempt requests. Jun, did
you mean a different request that needs to be added? The four requests
currently exempt in the KIP are StopReplica, ControlledShutdown,
LeaderAndIsr and UpdateMetadata. These are controlled using the
ClusterAction ACL, so it is easy to exclude them and throttle only when the
caller is unauthorized. I wasn't sure if there are other requests used only
for inter-broker communication that need to be excluded.
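
The rule is essentially the following (a sketch only; the method name and
the way authorization is passed in are illustrative, not the KIP's API):

    // Inter-broker request types are throttled only when the caller fails
    // the ClusterAction authorization check; all other request types are
    // always subject to the quota.
    val interBrokerApis = Set("StopReplica", "ControlledShutdown",
                              "LeaderAndIsr", "UpdateMetadata")

    def subjectToQuota(apiName: String,
                       authorizedForClusterAction: Boolean): Boolean =
      if (interBrokerApis.contains(apiName)) !authorizedForClusterAction
      else true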

3. I was thinking the smallest change would be to replace all references to
*requestChannel.sendResponse()* with a local method
*sendResponseMaybeThrottle()* that applies any required throttling and then
sends the response. If we throttle first in *KafkaApis.handle()*, the time
spent within the method handling the request would not be recorded or used
in throttling. We can look into this again when the PR is ready for review.
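
Roughly along these lines (a self-contained sketch, not the actual broker
classes; a real implementation would delay the response via Purgatory
rather than blocking a thread, and would use the quota manager's sample
windows):

    // Record the handler time used by each client and delay the response
    // when the client's share of handler time exceeds its quota fraction.
    class ThrottlingResponder(quotaFraction: Double, send: String => Unit) {
      private val usedNanos = scala.collection.mutable.Map
        .empty[String, Long].withDefaultValue(0L)
      private val windowStartNanos = System.nanoTime()

      def sendResponseMaybeThrottle(clientId: String, response: String,
                                    handlerNanos: Long): Unit = {
        usedNanos(clientId) += handlerNanos
        val elapsed = math.max(System.nanoTime() - windowStartNanos, 1L)
        if (usedNanos(clientId).toDouble / elapsed > quotaFraction) {
          // Wait just long enough for usage to fall back under the quota.
          val targetElapsed = (usedNanos(clientId) / quotaFraction).toLong
          Thread.sleep((targetElapsed - elapsed) / 1000000L)
        }
        send(response)
      }
    }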

Regards,

Rajini



On Wed, Feb 22, 2017 at 5:55 PM, Roger Hoover <ro...@gmail.com>
wrote:

> Great to see this KIP and the excellent discussion.
>
> To me, Jun's suggestion makes sense.  If my application is allocated 1
> request handler unit, then it's as if I have a Kafka broker with a single
> request handler thread dedicated to me.  That's the most I can use, at
> least.  That allocation doesn't change even if an admin later increases the
> size of the request thread pool on the broker.  It's similar to the CPU
> abstraction that VMs and containers get from hypervisors or OS schedulers.
> While different client access patterns can use wildly different amounts of
> request thread resources per request, a given application will generally
> have a stable access pattern and can figure out empirically how many
> "request thread units" it needs to meet it's throughput/latency goals.
>
> Cheers,
>
> Roger
>
> On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <ju...@confluent.io> wrote:
>
> > Hi, Rajini,
> >
> > Thanks for the updated KIP. A few more comments.
> >
> > 1. A concern of request_time_percent is that it's not an absolute value.
> > Let's say you give a user a 10% limit. If the admin doubles the number of
> > request handler threads, that user now actually has twice the absolute
> > capacity. This may confuse people a bit. So, perhaps setting the quota
> > based on an absolute request thread unit is better.
> >
> > 2. ControlledShutdownRequest is also an inter-broker request and needs to
> > be excluded from throttling.
> >
> > 3. Implementation-wise, I am wondering if it's simpler to apply the
> > request time throttling first in KafkaApis.handle(). Otherwise, we will
> > need to add the throttling logic in each type of request.
> >
> > Thanks,
> >
> > Jun
> >
> > On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> >
> > > Jun,
> > >
> > > Thank you for the review.
> > >
> > > I have reverted to the original KIP that throttles based on request
> > > handler utilization. At the moment, it uses a percentage, but I am
> > > happy to change to a fraction (out of 1 instead of 100) if required. I
> > > have added the examples from this discussion to the KIP. Also added a
> > > "Future Work" section to address network thread utilization. The
> > > configuration is named "request_time_percent" with the expectation
> > > that it can also be used as the limit for network thread utilization
> > > when that is implemented, so that users have to set only one config
> > > for the two and not have to worry about the internal distribution of
> > > the work between the two thread pools in Kafka.
> > >
> > >
> > > Regards,
> > >
> > > Rajini
> > >
> > >
> > > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <ju...@confluent.io> wrote:
> > >
> > > > Hi, Rajini,
> > > >
> > > > Thanks for the proposal.
> > > >
> > > > The benefit of using the request processing time over the request
> > > > rate is exactly what people have said. I will just expand that a
> > > > bit. Consider the following case. The producer sends a produce
> > > > request with a 10MB message but compressed to 100KB with gzip. The
> > > > decompression of the message on the broker could take 10-15 seconds,
> > > > during which time a request handler thread is completely blocked. In
> > > > this case, neither the byte-in quota nor the request rate quota may
> > > > be effective in protecting the broker. Consider another case. A
> > > > consumer group starts with 10 instances and later on switches to 20
> > > > instances. The request rate will likely double, but the actual load
> > > > on the broker may not double since each fetch request only contains
> > > > half of the partitions. A request rate quota may not be easy to
> > > > configure in this case.
> > > >
> > > > What we really want is to be able to prevent a client from using too
> > > > much of the server-side resources. In this particular KIP, this
> > > > resource is the capacity of the request handler threads. I agree
> > > > that it may not be intuitive for the users to determine how to set
> > > > the right limit. However, this is not completely new and has been
> > > > done in the container world already. For example, Linux cgroup
> > > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> > > > has the concept of cpu.cfs_quota_us, which specifies the total
> > > > amount of time in microseconds for which all tasks in a cgroup can
> > > > run during a one second period. We can potentially model the request
> > > > handler threads in a similar way. For example, each request handler
> > > > thread can be 1 request handler unit and the admin can configure a
> > > > limit on how many units (say 0.01) a client can have.
> > > >
> > > > Regarding not throttling the internal broker-to-broker requests: we
> > > > could do that. Alternatively, we could just let the admin configure
> > > > a high limit for the kafka user (it may not be able to do that
> > > > easily based on clientId though).
> > > >
> > > > Ideally we want to be able to protect the utilization of the network
> > > > thread pool too. The difficulty is mostly what Rajini said: (1) The
> > > > mechanism for throttling the requests is through Purgatory and we
> > > > will have to think through how to integrate that into the network
> > > > layer. (2) In the network layer, currently we know the user, but not
> > > > the clientId of the request. So, it's a bit tricky to throttle based
> > > > on clientId there. Plus, the byteOut quota can already protect the
> > > > network thread utilization for fetch requests. So, if we can't
> > > > figure out this part right now, just focusing on the request
> > > > handling threads for this KIP is still a useful feature.
> > > >
> > > > Thanks,
> > > >
> > > > Jun
> > > >
> > > >
> > > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > >
> > > > > Thank you all for the feedback.
> > > > >
> > > > > Jay: I have removed the exemption for consumer heartbeat etc. Agree
> > > > > that protecting the cluster is more important than protecting
> > > > > individual apps. Have retained the exemption for
> > > > > StopReplica/LeaderAndIsr etc.; these are throttled only if
> > > > > authorization fails (so they can't be used for DoS attacks in a
> > > > > secure cluster, but inter-broker requests can complete without
> > > > > delays).
> > > > >
> > > > > I will wait another day to see if there is any objection to quotas
> > > > > based on request processing time (as opposed to request rate) and
> > > > > if there are no objections, I will revert to the original proposal
> > > > > with some changes.
> > > > >
> > > > > The original proposal was only including the time used by the
> > > > > request handler threads (that made calculation easy). I think the
> > > > > suggestion is to include the time spent in the network threads as
> > > > > well since that may be significant. As Jay pointed out, it is more
> > > > > complicated to calculate the total available CPU time and convert
> > > > > it to a ratio when there are *m* I/O threads and *n* network
> > > > > threads. ThreadMXBean#getThreadCpuTime() may give us what we want,
> > > > > but it can be very expensive on some platforms. As Becket and
> > > > > Guozhang have pointed out, we do have several time measurements
> > > > > already for generating metrics that we could use, though we might
> > > > > want to switch to nanoTime() instead of currentTimeMillis() since
> > > > > some of the values for small requests may be < 1ms. But rather than
> > > > > add up the time spent in the I/O thread and the network thread,
> > > > > wouldn't it be better to convert the time spent on each thread into
> > > > > a separate ratio? UserA has a request quota of 5%. Can we take that
> > > > > to mean that UserA can use 5% of the time on network threads and 5%
> > > > > of the time on I/O threads? If either is exceeded, the response is
> > > > > throttled - it would mean maintaining two sets of metrics for the
> > > > > two durations, but would result in more meaningful ratios. We could
> > > > > define two quota limits (UserA has 5% of request threads and 10% of
> > > > > network threads), but that seems unnecessary and harder to explain
> > > > > to users.
> > > > >
> > > > > Back to why and how quotas are applied to network thread
> > > > > utilization:
> > > > > a) In the case of fetch, the time spent in the network thread may
> > > > > be significant and I can see the need to include this. Are there
> > > > > other requests where the network thread utilization is significant?
> > > > > In the case of fetch, request handler thread utilization would
> > > > > throttle clients with a high request rate and low data volume, and
> > > > > the fetch byte rate quota will throttle clients with high data
> > > > > volume. Network thread utilization is perhaps proportional to the
> > > > > data volume. I am wondering if we even need to throttle based on
> > > > > network thread utilization or whether the data volume quota covers
> > > > > this case.
> > > > >
> > > > > b) At the moment, we record and check for quota violation at the
> > > > > same time. If a quota is violated, the response is delayed. Using
> > > > > Jay's example of disk reads for fetches happening in the network
> > > > > thread, we can't record and delay a response after the disk reads.
> > > > > We could record the time spent on the network thread when the
> > > > > response is complete and introduce a delay for handling a
> > > > > subsequent request (separating out recording and quota violation
> > > > > handling in the case of network thread overload). Does that make
> > > > > sense?
> > > > >
> > > > >
> > > > > Regards,
> > > > >
> > > > > Rajini
> > > > >
> > > > >
> > > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <be...@gmail.com> wrote:
> > > > >
> > > > > > Hey Jay,
> > > > > >
> > > > > > Yeah, I agree that enforcing the CPU time is a little tricky. I
> > > > > > am thinking that maybe we can use the existing request
> > > > > > statistics. They are already very detailed so we can probably see
> > > > > > the approximate CPU time from them, e.g. something like
> > > > > > (total_time - request/response_queue_time - remote_time).
> > > > > >
> > > > > > I agree with Guozhang that when a user is throttled it is likely
> > > > > > that we need to see if anything has gone wrong first, and if the
> > > > > > users are well behaved and just need more resources, we will have
> > > > > > to bump up the quota for them. It is true that pre-allocating CPU
> > > > > > time quota precisely for the users is difficult. So in practice
> > > > > > it would probably be more like first setting a relatively high
> > > > > > protective CPU time quota for everyone and increasing that for
> > > > > > some individual clients on demand.
> > > > > >
> > > > > > Thanks,
> > > > > >
> > > > > > Jiangjie (Becket) Qin
> > > > > >
> > > > > >
> > > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > > > > >
> > > > > > > This is a great proposal, glad to see it happening.
> > > > > > >
> > > > > > > I am inclined to the CPU throttling, or more specifically the
> > > > > > > processing time ratio, instead of the request rate throttling
> > > > > > > as well. Becket has very well summed up my rationales above,
> > > > > > > and one thing to add here is that the former has good support
> > > > > > > both for "protecting against rogue clients" and for "utilizing
> > > > > > > a cluster for multi-tenancy usage": when thinking about how to
> > > > > > > explain this to the end users, I find it actually more natural
> > > > > > > than the request rate since, as mentioned above, different
> > > > > > > requests will have quite different "cost", and Kafka today
> > > > > > > already has various request types (produce, fetch, admin,
> > > > > > > metadata, etc); because of that, the request rate throttling
> > > > > > > may not be as effective unless it is set very conservatively.
> > > > > > >
> > > > > > > Regarding user reactions when they are throttled, I think it
> > > > > > > may differ case-by-case, and needs to be discovered / guided by
> > > > > > > looking at the relevant metrics. So in other words users would
> > > > > > > not expect to get additional information by simply being told
> > > > > > > "hey, you are throttled", which is all that throttling does;
> > > > > > > they need to take a follow-up step and see "hmm, I'm throttled
> > > > > > > probably because of ..", which is done by looking at other
> > > > > > > metric values: e.g. whether I'm bombarding the brokers with
> > > > > > > metadata requests, which are usually cheap to handle but I'm
> > > > > > > sending thousands per second; or is it because I'm catching up
> > > > > > > and hence sending very heavy fetch requests with large
> > > > > > > min.bytes, etc.
> > > > > > >
> > > > > > > Regarding the implementation, as once discussed with Jun, this
> > > > > > > seems not very difficult since today we are already collecting
> > > > > > > the "thread pool utilization" metrics, which is a single
> > > > > > > percentage "aggregateIdleMeter" value; but we are already
> > > > > > > effectively aggregating it for each request in
> > > > > > > KafkaRequestHandler, and we can just extend it by recording the
> > > > > > > source client id when handling them and aggregating by clientId
> > > > > > > as well as the total aggregate.
> > > > > > >
> > > > > > >
> > > > > > > Guozhang
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <ja...@confluent.io> wrote:
> > > > > > >
> > > > > > > > Hey Becket/Rajini,
> > > > > > > >
> > > > > > > > When I thought about it more deeply I came around to the
> > > > > > > > "percent of processing time" metric too. It seems a lot
> > > > > > > > closer to the thing we actually care about and need to
> > > > > > > > protect. I also think this would be a very useful metric even
> > > > > > > > in the absence of throttling just to debug who's using
> > > > > > > > capacity.
> > > > > > > >
> > > > > > > > Two problems to consider:
> > > > > > > >
> > > > > > > >    1. I agree that for the user it is understandable what led
> > > > > > > >    to their being throttled, but it is a bit hard to figure
> > > > > > > >    out the safe range for them. i.e. if I have a new app that
> > > > > > > >    will send 200 messages/sec I can probably reason that I'll
> > > > > > > >    be under the throttling limit of 300 req/sec. However if I
> > > > > > > >    need to be under a 10% CPU resources limit it may be a bit
> > > > > > > >    harder for me to know a priori if I will or won't.
> > > > > > > >    2. Calculating the available CPU time is a bit difficult
> > > > > > > >    since there are actually two thread pools--the I/O threads
> > > > > > > >    and the network threads. I think it might be workable to
> > > > > > > >    count just the I/O thread time as in the proposal, but the
> > > > > > > >    network thread work is actually non-trivial (e.g. all the
> > > > > > > >    disk reads for fetches happen in that thread). If you
> > > > > > > >    count both the network and I/O threads it can skew things
> > > > > > > >    a bit. E.g. say you have 50 network threads, 10 I/O
> > > > > > > >    threads, and 8 cores, what is the available CPU time in a
> > > > > > > >    second? I suppose this is a problem whenever you have a
> > > > > > > >    bottleneck between I/O and network threads or if you end
> > > > > > > >    up significantly over-provisioning one pool (both of which
> > > > > > > >    are hard to avoid).
> > > > > > > >
> > > > > > > > An alternative for CPU throttling would be to use this api:
> > > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> > > > > > > >
> > > > > > > > That would let you track actual CPU usage across the network,
> > > > > > > > I/O, and purgatory threads and look at it as a percentage of
> > > > > > > > total cores. I think this fixes many problems in the
> > > > > > > > reliability of the metric. Its meaning is slightly different
> > > > > > > > as it is just CPU (you don't get charged for time blocking on
> > > > > > > > I/O) but that may be okay because we already have a throttle
> > > > > > > > on I/O. The downside is I think it is possible this api can
> > > > > > > > be disabled or isn't always available and it may also be
> > > > > > > > expensive (also I've never used it so not sure if it really
> > > > > > > > works the way I think).
> > > > > > > >
> > > > > > > > -Jay
> > > > > > > >
> > > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <becket.qin@gmail.com> wrote:
> > > > > > > >
> > > > > > > > > If the purpose of the KIP is only to protect the cluster
> > > > > > > > > from being overwhelmed by crazy clients and is not intended
> > > > > > > > > to address the resource allocation problem among the
> > > > > > > > > clients, I am wondering if using a request handling time
> > > > > > > > > quota (CPU time quota) is a better option. Here are the
> > > > > > > > > reasons:
> > > > > > > > >
> > > > > > > > > 1. Request handling time quota has better protection. Say
> > > > > > > > > we have a request rate quota and set that to some value
> > > > > > > > > like 100 requests/sec; it is possible that some of the
> > > > > > > > > requests are very expensive and actually take a lot of time
> > > > > > > > > to handle. In that case a few clients may still occupy a
> > > > > > > > > lot of CPU time even though the request rate is low.
> > > > > > > > > Arguably we can carefully set the request rate quota for
> > > > > > > > > each request and client id combination, but it could still
> > > > > > > > > be tricky to get it right for everyone.
> > > > > > > > >
> > > > > > > > > If we use the request handling time quota, we can simply
> > > > > > > > > say that no client can take up more than 30% of the total
> > > > > > > > > request handling capacity (measured by time), regardless of
> > > > > > > > > the difference among different requests or what the client
> > > > > > > > > is doing. In this case maybe we can quota all the requests
> > > > > > > > > if we want to.
> > > > > > > > >
> > > > > > > > > 2. The main benefit of using a request rate limit is that
> > > > > > > > > it seems more intuitive. It is true that it is probably
> > > > > > > > > easier to explain to the user what that means. However, in
> > > > > > > > > practice it looks like the impact of a request rate quota
> > > > > > > > > is not more quantifiable than that of the request handling
> > > > > > > > > time quota. Unlike the byte rate quota, it is still
> > > > > > > > > difficult to give a number for the impact on throughput or
> > > > > > > > > latency when a request rate quota is hit. So it is not
> > > > > > > > > better than the request handling time quota. In fact I feel
> > > > > > > > > it is clearer to tell the user "you are limited because you
> > > > > > > > > have taken 30% of the CPU time on the broker" than
> > > > > > > > > something like "your request rate quota on metadata
> > > > > > > > > requests has been reached".
> > > > > > > > >
> > > > > > > > > Thanks,
> > > > > > > > >
> > > > > > > > > Jiangjie (Becket) Qin
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io> wrote:
> > > > > > > > >
> > > > > > > > > > I think this proposal makes a lot of sense (especially
> > > > > > > > > > now that it is oriented around request rate) and fills
> > > > > > > > > > the biggest remaining gap in the multi-tenancy story.
> > > > > > > > > >
> > > > > > > > > > I think for intra-cluster communication (StopReplica,
> > > > > > > > > > etc) we could avoid throttling entirely. You can secure
> > > > > > > > > > or otherwise lock down the cluster communication to
> > > > > > > > > > prevent any unauthorized external party from trying to
> > > > > > > > > > initiate these requests. As a result we are as likely to
> > > > > > > > > > cause problems as solve them by throttling these, right?
> > > > > > > > > >
> > > > > > > > > > I'm not so sure that we should exempt the consumer
> > > > > > > > > > requests such as heartbeat. It's true that if we throttle
> > > > > > > > > > an app's heartbeat requests it may cause it to fall out
> > > > > > > > > > of its consumer group. However if we don't throttle it,
> > > > > > > > > > it may DDOS the cluster if the heartbeat interval is set
> > > > > > > > > > incorrectly or if some client in some language has a bug.
> > > > > > > > > > I think the policy with this kind of throttling is to
> > > > > > > > > > protect the cluster above any individual app, right? I
> > > > > > > > > > think in general this should be okay since for most
> > > > > > > > > > deployments this setting is meant as more of a safety
> > > > > > > > > > valve---that is, rather than setting something very close
> > > > > > > > > > to what you expect to need (say 2 req/sec or whatever)
> > > > > > > > > > you would have something quite high (like 100 req/sec)
> > > > > > > > > > meant to prevent a client gone crazy. I think when used
> > > > > > > > > > this way allowing those to be throttled would actually
> > > > > > > > > > provide meaningful protection.
> > > > > > > > > >
> > > > > > > > > > -Jay
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > >
> > > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > > > > > > > > >
> > > > > > > > > > > Hi all,
> > > > > > > > > > >
> > > > > > > > > > > I have just created KIP-124 to introduce request rate
> > > > > > > > > > > quotas to Kafka:
> > > > > > > > > > >
> > > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-124+-+Request+rate+quotas
> > > > > > > > > > >
> > > > > > > > > > > The proposal is for a simple percentage request
> > > > > > > > > > > handling time quota that can be allocated to
> > > > > > > > > > > *<client-id>*, *<user>* or *<user, client-id>*. There
> > > > > > > > > > > are a few other suggestions also under "Rejected
> > > > > > > > > > > alternatives". Feedback and suggestions are welcome.
> > > > > > > > > > >
> > > > > > > > > > > Thank you...
> > > > > > > > > > >
> > > > > > > > > > > Regards,
> > > > > > > > > > >
> > > > > > > > > > > Rajini
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > > >
> > > > > > >
> > > > > > > --
> > > > > > > -- Guozhang
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Roger Hoover <ro...@gmail.com>.
Great to see this KIP and the excellent discussion.

To me, Jun's suggestion makes sense.  If my application is allocated 1
request handler unit, then it's as if I have a Kafka broker with a single
request handler thread dedicated to me.  That's the most I can use, at
least.  That allocation doesn't change even if an admin later increases the
size of the request thread pool on the broker.  It's similar to the CPU
abstraction that VMs and containers get from hypervisors or OS schedulers.
While different client access patterns can use wildly different amounts of
request thread resources per request, a given application will generally
have a stable access pattern and can figure out empirically how many
"request thread units" it needs to meet it's throughput/latency goals.

Cheers,

Roger

On Wed, Feb 22, 2017 at 8:53 AM, Jun Rao <ju...@confluent.io> wrote:

> Hi, Rajini,
>
> Thanks for the updated KIP. A few more comments.
>
> 1. A concern of request_time_percent is that it's not an absolute value.
> Let's say you give a user a 10% limit. If the admin doubles the number of
> request handler threads, that user now actually has twice the absolute
> capacity. This may confuse people a bit. So, perhaps setting the quota
> based on an absolute request thread unit is better.
>
> 2. ControlledShutdownRequest is also an inter-broker request and needs to
> be excluded from throttling.
>
> 3. Implementation-wise, I am wondering if it's simpler to apply the request
> time throttling first in KafkaApis.handle(). Otherwise, we will need to add
> the throttling logic in each type of request.
>
> Thanks,
>
> Jun
>
> On Wed, Feb 22, 2017 at 5:58 AM, Rajini Sivaram <ra...@gmail.com>
> wrote:
>
> > Jun,
> >
> > Thank you for the review.
> >
> > I have reverted to the original KIP that throttles based on request
> > handler utilization. At the moment, it uses a percentage, but I am happy
> > to change to a fraction (out of 1 instead of 100) if required. I have
> > added the examples from this discussion to the KIP. Also added a "Future
> > Work" section to address network thread utilization. The configuration
> > is named "request_time_percent" with the expectation that it can also be
> > used as the limit for network thread utilization when that is
> > implemented, so that users have to set only one config for the two and
> > not have to worry about the internal distribution of the work between
> > the two thread pools in Kafka.
> >
> >
> > Regards,
> >
> > Rajini
> >
> >
> > On Wed, Feb 22, 2017 at 12:23 AM, Jun Rao <ju...@confluent.io> wrote:
> >
> > > Hi, Rajini,
> > >
> > > Thanks for the proposal.
> > >
> > > The benefit of using the request processing time over the request rate
> > > is exactly what people have said. I will just expand that a bit.
> > > Consider the following case. The producer sends a produce request with
> > > a 10MB message but compressed to 100KB with gzip. The decompression of
> > > the message on the broker could take 10-15 seconds, during which time a
> > > request handler thread is completely blocked. In this case, neither the
> > > byte-in quota nor the request rate quota may be effective in protecting
> > > the broker. Consider another case. A consumer group starts with 10
> > > instances and later on switches to 20 instances. The request rate will
> > > likely double, but the actual load on the broker may not double since
> > > each fetch request only contains half of the partitions. A request rate
> > > quota may not be easy to configure in this case.
> > >
> > > What we really want is to be able to prevent a client from using too
> > > much of the server-side resources. In this particular KIP, this
> > > resource is the capacity of the request handler threads. I agree that
> > > it may not be intuitive for the users to determine how to set the right
> > > limit. However, this is not completely new and has been done in the
> > > container world already. For example, Linux cgroup
> > > (https://access.redhat.com/documentation/en-US/Red_Hat_Enterprise_Linux/6/html/Resource_Management_Guide/sec-cpu.html)
> > > has the concept of cpu.cfs_quota_us, which specifies the total amount
> > > of time in microseconds for which all tasks in a cgroup can run during
> > > a one second period. We can potentially model the request handler
> > > threads in a similar way. For example, each request handler thread can
> > > be 1 request handler unit and the admin can configure a limit on how
> > > many units (say 0.01) a client can have.
> > >
> > > Regarding not throttling the internal broker-to-broker requests: we
> > > could do that. Alternatively, we could just let the admin configure a
> > > high limit for the kafka user (it may not be able to do that easily
> > > based on clientId though).
> > >
> > > Ideally we want to be able to protect the utilization of the network
> > > thread pool too. The difficulty is mostly what Rajini said: (1) The
> > > mechanism for throttling the requests is through Purgatory and we will
> > > have to think through how to integrate that into the network layer.
> > > (2) In the network layer, currently we know the user, but not the
> > > clientId of the request. So, it's a bit tricky to throttle based on
> > > clientId there. Plus, the byteOut quota can already protect the network
> > > thread utilization for fetch requests. So, if we can't figure out this
> > > part right now, just focusing on the request handling threads for this
> > > KIP is still a useful feature.
> > >
> > > Thanks,
> > >
> > > Jun
> > >
> > >
> > > On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <rajinisivaram@gmail.com> wrote:
> > >
> > > > Thank you all for the feedback.
> > > >
> > > > Jay: I have removed the exemption for consumer heartbeat etc. Agree
> > > > that protecting the cluster is more important than protecting
> > > > individual apps. Have retained the exemption for
> > > > StopReplica/LeaderAndIsr etc.; these are throttled only if
> > > > authorization fails (so they can't be used for DoS attacks in a
> > > > secure cluster, but inter-broker requests can complete without
> > > > delays).
> > > >
> > > > I will wait another day to see if there is any objection to quotas
> > > > based on request processing time (as opposed to request rate) and if
> > > > there are no objections, I will revert to the original proposal with
> > > > some changes.
> > > >
> > > > The original proposal was only including the time used by the request
> > > > handler threads (that made calculation easy). I think the suggestion
> > > > is to include the time spent in the network threads as well since
> > > > that may be significant. As Jay pointed out, it is more complicated
> > > > to calculate the total available CPU time and convert it to a ratio
> > > > when there are *m* I/O threads and *n* network threads.
> > > > ThreadMXBean#getThreadCpuTime() may give us what we want, but it can
> > > > be very expensive on some platforms. As Becket and Guozhang have
> > > > pointed out, we do have several time measurements already for
> > > > generating metrics that we could use, though we might want to switch
> > > > to nanoTime() instead of currentTimeMillis() since some of the values
> > > > for small requests may be < 1ms. But rather than add up the time
> > > > spent in the I/O thread and the network thread, wouldn't it be better
> > > > to convert the time spent on each thread into a separate ratio? UserA
> > > > has a request quota of 5%. Can we take that to mean that UserA can
> > > > use 5% of the time on network threads and 5% of the time on I/O
> > > > threads? If either is exceeded, the response is throttled - it would
> > > > mean maintaining two sets of metrics for the two durations, but would
> > > > result in more meaningful ratios. We could define two quota limits
> > > > (UserA has 5% of request threads and 10% of network threads), but
> > > > that seems unnecessary and harder to explain to users.
> > > >
> > > > Back to why and how quotas are applied to network thread utilization:
> > > > a) In the case of fetch, the time spent in the network thread may be
> > > > significant and I can see the need to include this. Are there other
> > > > requests where the network thread utilization is significant? In the
> > > > case of fetch, request handler thread utilization would throttle
> > > > clients with a high request rate and low data volume, and the fetch
> > > > byte rate quota will throttle clients with high data volume. Network
> > > > thread utilization is perhaps proportional to the data volume. I am
> > > > wondering if we even need to throttle based on network thread
> > > > utilization or whether the data volume quota covers this case.
> > > >
> > > > b) At the moment, we record and check for quota violation at the same
> > > > time. If a quota is violated, the response is delayed. Using Jay's
> > > > example of disk reads for fetches happening in the network thread, we
> > > > can't record and delay a response after the disk reads. We could
> > > > record the time spent on the network thread when the response is
> > > > complete and introduce a delay for handling a subsequent request
> > > > (separating out recording and quota violation handling in the case of
> > > > network thread overload). Does that make sense?
> > > >
> > > >
> > > > Regards,
> > > >
> > > > Rajini
> > > >
> > > >
> > > > On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <be...@gmail.com> wrote:
> > > >
> > > > > Hey Jay,
> > > > >
> > > > > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> > > > > thinking that maybe we can use the existing request statistics.
> > > > > They are already very detailed so we can probably see the
> > > > > approximate CPU time from them, e.g. something like (total_time -
> > > > > request/response_queue_time - remote_time).
> > > > >
> > > > > I agree with Guozhang that when a user is throttled it is likely
> > > > > that we need to see if anything has gone wrong first, and if the
> > > > > users are well behaved and just need more resources, we will have
> > > > > to bump up the quota for them. It is true that pre-allocating CPU
> > > > > time quota precisely for the users is difficult. So in practice it
> > > > > would probably be more like first setting a relatively high
> > > > > protective CPU time quota for everyone and increasing that for some
> > > > > individual clients on demand.
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > >
> > > > > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wangguoz@gmail.com> wrote:
> > > > >
> > > > > > This is a great proposal, glad to see it happening.
> > > > > >
> > > > > > I am inclined to the CPU throttling, or more specifically the
> > > > > > processing time ratio, instead of the request rate throttling as
> > > > > > well. Becket has very well summed up my rationales above, and one
> > > > > > thing to add here is that the former has good support both for
> > > > > > "protecting against rogue clients" and for "utilizing a cluster
> > > > > > for multi-tenancy usage": when thinking about how to explain this
> > > > > > to the end users, I find it actually more natural than the
> > > > > > request rate since, as mentioned above, different requests will
> > > > > > have quite different "cost", and Kafka today already has various
> > > > > > request types (produce, fetch, admin, metadata, etc); because of
> > > > > > that, the request rate throttling may not be as effective unless
> > > > > > it is set very conservatively.
> > > > > >
> > > > > > Regarding user reactions when they are throttled, I think it may
> > > > > > differ case-by-case, and needs to be discovered / guided by
> > > > > > looking at the relevant metrics. So in other words users would
> > > > > > not expect to get additional information by simply being told
> > > > > > "hey, you are throttled", which is all that throttling does; they
> > > > > > need to take a follow-up step and see "hmm, I'm throttled
> > > > > > probably because of ..", which is done by looking at other metric
> > > > > > values: e.g. whether I'm bombarding the brokers with metadata
> > > > > > requests, which are usually cheap to handle but I'm sending
> > > > > > thousands per second; or is it because I'm catching up and hence
> > > > > > sending very heavy fetch requests with large min.bytes, etc.
> > > > > >
> > > > > > Regarding the implementation, as once discussed with Jun, this
> > > > > > seems not very difficult since today we are already collecting
> > > > > > the "thread pool utilization" metrics, which is a single
> > > > > > percentage "aggregateIdleMeter" value; but we are already
> > > > > > effectively aggregating it for each request in
> > > > > > KafkaRequestHandler, and we can just extend it by recording the
> > > > > > source client id when handling them and aggregating by clientId
> > > > > > as well as the total aggregate.
> > > > > >
> > > > > >
> > > > > > Guozhang
> > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <ja...@confluent.io> wrote:
> > > > > >
> > > > > > > Hey Becket/Rajini,
> > > > > > >
> > > > > > > When I thought about it more deeply I came around to the
> > > > > > > "percent of processing time" metric too. It seems a lot closer
> > > > > > > to the thing we actually care about and need to protect. I also
> > > > > > > think this would be a very useful metric even in the absence of
> > > > > > > throttling just to debug who's using capacity.
> > > > > > >
> > > > > > > Two problems to consider:
> > > > > > >
> > > > > > >    1. I agree that for the user it is understandable what led
> > > > > > >    to their being throttled, but it is a bit hard to figure out
> > > > > > >    the safe range for them. i.e. if I have a new app that will
> > > > > > >    send 200 messages/sec I can probably reason that I'll be
> > > > > > >    under the throttling limit of 300 req/sec. However if I need
> > > > > > >    to be under a 10% CPU resources limit it may be a bit harder
> > > > > > >    for me to know a priori if I will or won't.
> > > > > > >    2. Calculating the available CPU time is a bit difficult
> > > > > > >    since there are actually two thread pools--the I/O threads
> > > > > > >    and the network threads. I think it might be workable to
> > > > > > >    count just the I/O thread time as in the proposal, but the
> > > > > > >    network thread work is actually non-trivial (e.g. all the
> > > > > > >    disk reads for fetches happen in that thread). If you count
> > > > > > >    both the network and I/O threads it can skew things a bit.
> > > > > > >    E.g. say you have 50 network threads, 10 I/O threads, and 8
> > > > > > >    cores, what is the available CPU time in a second? I suppose
> > > > > > >    this is a problem whenever you have a bottleneck between I/O
> > > > > > >    and network threads or if you end up significantly
> > > > > > >    over-provisioning one pool (both of which are hard to
> > > > > > >    avoid).
> > > > > > >
> > > > > > > An alternative for CPU throttling would be to use this api:
> > > > > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)
> > > > > > >
> > > > > > > That would let you track actual CPU usage across the network,
> > > > > > > I/O, and purgatory threads and look at it as a percentage of
> > > > > > > total cores. I think this fixes many problems in the
> > > > > > > reliability of the metric. Its meaning is slightly different as
> > > > > > > it is just CPU (you don't get charged for time blocking on I/O)
> > > > > > > but that may be okay because we already have a throttle on I/O.
> > > > > > > The downside is I think it is possible this api can be disabled
> > > > > > > or isn't always available and it may also be expensive (also
> > > > > > > I've never used it so not sure if it really works the way I
> > > > > > > think).
> > > > > > >
> > > > > > > -Jay
> > > > > > >
> > > > > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <
> > becket.qin@gmail.com>
> > > > > > wrote:
> > > > > > >
> > > > > > > > If the purpose of the KIP is only to protect the cluster from
> > > being
> > > > > > > > overwhelmed by crazy clients and is not intended to address
> > > > resource
> > > > > > > > allocation problem among the clients, I am wondering if using
> > > > request
> > > > > > > > handling time quota (CPU time quota) is a better option. Here
> > are
> > > > the
> > > > > > > > reasons:
> > > > > > > >
> > > > > > > > 1. request handling time quota has better protection. Say we
> > have
> > > > > > request
> > > > > > > > rate quota and set that to some value like 100 requests/sec,
> it
> > > is
> > > > > > > possible
> > > > > > > > that some of the requests are very expensive actually take a
> > lot
> > > of
> > > > > > time
> > > > > > > to
> > > > > > > > handle. In that case a few clients may still occupy a lot of
> > CPU
> > > > time
> > > > > > > even
> > > > > > > > the request rate is low. Arguably we can carefully set
> request
> > > rate
> > > > > > quota
> > > > > > > > for each request and client id combination, but it could
> still
> > be
> > > > > > tricky
> > > > > > > to
> > > > > > > > get it right for everyone.
> > > > > > > >
> > > > > > > > If we use the request time handling quota, we can simply say
> no
> > > > > clients
> > > > > > > can
> > > > > > > > take up to more than 30% of the total request handling
> capacity
> > > > > > (measured
> > > > > > > > by time), regardless of the difference among different
> requests
> > > or
> > > > > what
> > > > > > > is
> > > > > > > > the client doing. In this case maybe we can quota all the
> > > requests
> > > > if
> > > > > > we
> > > > > > > > want to.
> > > > > > > >
> > > > > > > > 2. The main benefit of using request rate limit is that it
> > seems
> > > > more
> > > > > > > > intuitive. It is true that it is probably easier to explain
> to
> > > the
> > > > > user
> > > > > > > > what does that mean. However, in practice it looks the impact
> > of
> > > > > > request
> > > > > > > > rate quota is not more quantifiable than the request handling
> > > time
> > > > > > quota.
> > > > > > > > Unlike the byte rate quota, it is still difficult to give a
> > > number
> > > > > > about
> > > > > > > > impact of throughput or latency when a request rate quota is
> > hit.
> > > > So
> > > > > it
> > > > > > > is
> > > > > > > > not better than the request handling time quota. In fact I
> feel
> > > it
> > > > is
> > > > > > > > clearer to tell user that "you are limited because you have
> > taken
> > > > 30%
> > > > > > of
> > > > > > > > the CPU time on the broker" than otherwise something like
> "your
> > > > > request
> > > > > > > > rate quota on metadata request has reached".
> > > > > > > >
> > > > > > > > Thanks,
> > > > > > > >
> > > > > > > > Jiangjie (Becket) Qin
> > > > > > > >
> > > > > > > >
> > > > > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <jay@confluent.io
> >
> > > > wrote:
> > > > > > > >
> > > > > > > > > I think this proposal makes a lot of sense (especially now
> > that
> > > > it
> > > > > is
> > > > > > > > > oriented around request rate) and fills the biggest
> remaining
> > > gap
> > > > > in
> > > > > > > the
> > > > > > > > > multi-tenancy story.
> > > > > > > > >
> > > > > > > > > I think for intra-cluster communication (StopReplica, etc)
> we
> > > > could
> > > > > > > avoid
> > > > > > > > > throttling entirely. You can secure or otherwise lock-down
> > the
> > > > > > cluster
> > > > > > > > > communication to avoid any unauthorized external party from
> > > > trying
> > > > > to
> > > > > > > > > initiate these requests. As a result we are as likely to
> > cause
> > > > > > problems
> > > > > > > > as
> > > > > > > > > solve them by throttling these, right?
> > > > > > > > >
> > > > > > > > > I'm not so sure that we should exempt the consumer requests
> > > such
> > > > as
> > > > > > > > > heartbeat. It's true that if we throttle an app's heartbeat
> > > > > requests
> > > > > > it
> > > > > > > > may
> > > > > > > > > cause it to fall out of its consumer group. However if we
> > don't
> > > > > > > throttle
> > > > > > > > it
> > > > > > > > > it may DDOS the cluster if the heartbeat interval is set
> > > > > incorrectly
> > > > > > or
> > > > > > > > if
> > > > > > > > > some client in some language has a bug. I think the policy
> > with
> > > > > this
> > > > > > > kind
> > > > > > > > > of throttling is to protect the cluster above any
> individual
> > > app,
> > > > > > > right?
> > > > > > > > I
> > > > > > > > > think in general this should be okay since for most
> > deployments
> > > > > this
> > > > > > > > > setting is meant as more of a safety valve---that is rather
> > > than
> > > > > set
> > > > > > > > > something very close to what you expect to need (say 2
> > req/sec
> > > or
> > > > > > > > whatever)
> > > > > > > > > you would have something quite high (like 100 req/sec) with
> > > this
> > > > > > meant
> > > > > > > to
> > > > > > > > > prevent a client gone crazy. I think when used this way
> > > allowing
> > > > > > those
> > > > > > > to
> > > > > > > > > be throttled would actually provide meaningful protection.
> > > > > > > > >
> > > > > > > > > -Jay
> > > > > > > > >
> > > > > > > > >
> > > > > > > > >
> > > > > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <
> > > > > > > rajinisivaram@gmail.com
> > > > > > > > >
> > > > > > > > > wrote:
> > > > > > > > >
> > > > > > > > > > Hi all,
> > > > > > > > > >
> > > > > > > > > > I have just created KIP-124 to introduce request rate
> > quotas
> > > to
> > > > > > > Kafka:
> > > > > > > > > >
> > > > > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > > > > 124+-+Request+rate+quotas
> > > > > > > > > >
> > > > > > > > > > The proposal is for a simple percentage request handling
> > time
> > > > > quota
> > > > > > > > that
> > > > > > > > > > can be allocated to *<client-id>*, *<user>* or *<user,
> > > > > client-id>*.
> > > > > > > > There
> > > > > > > > > > are a few other suggestions also under "Rejected
> > > alternatives".
> > > > > > > > Feedback
> > > > > > > > > > and suggestions are welcome.
> > > > > > > > > >
> > > > > > > > > > Thank you...
> > > > > > > > > >
> > > > > > > > > > Regards,
> > > > > > > > > >
> > > > > > > > > > Rajini
> > > > > > > > > >
> > > > > > > > >
> > > > > > > >
> > > > > > >
> > > > > >
> > > > > >
> > > > > >
> > > > > > --
> > > > > > -- Guozhang
> > > > > >
> > > > >
> > > >
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Thanks for the updated KIP. A few more comments.

1. A concern with request_time_percent is that it's not an absolute value.
Let's say you give a user a 10% limit. If the admin doubles the number of
request handler threads, that user now actually has twice the absolute
capacity. This may confuse people a bit. So, perhaps setting the quota
based on an absolute request thread unit is better.
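
To make that concrete (numbers purely illustrative): with num.io.threads=8,
a 10% limit corresponds to 0.8 request-handler-seconds per second; after the
admin doubles the pool to 16 threads, the same 10% allows 1.6 thread-seconds
per second. An absolute quota of 0.8 request handler units would stay fixed
no matter how the pool is sized.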

2. ControlledShutdownRequest is also an inter-broker request and needs to
be excluded from throttling.
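
(Together with the StopReplica/LeaderAndIsr exemption already in the KIP,
that would presumably also cover UpdateMetadataRequest, so that all
inter-broker traffic is exempt.)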

3. Implementation-wise, I am wondering if it's simpler to apply the request
time throttling up front in KafkaApis.handle(). Otherwise, we will need to
add the throttling logic to each request type separately.
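
To illustrate the idea (all names below are hypothetical, not code from the
KIP), a minimal sketch of such a central hook in Scala:

    import java.util.concurrent.ConcurrentHashMap
    import java.util.concurrent.atomic.DoubleAdder

    // Charge handler time to a clientId in one place, instead of adding
    // throttling logic to every request type.
    object RequestTimeThrottler {
      private val quotaPercent = 10.0  // illustrative per-client limit
      private val windowMs = 1000L     // measurement window (reset not shown)
      private val usedMs = new ConcurrentHashMap[String, DoubleAdder]()

      def handleWithQuota[T](clientId: String)(dispatch: => T): T = {
        val startNs = System.nanoTime()
        try dispatch
        finally {
          val adder = usedMs.computeIfAbsent(clientId, _ => new DoubleAdder)
          adder.add((System.nanoTime() - startNs) / 1e6)
          if (adder.sum > windowMs * quotaPercent / 100.0) {
            // A real broker would delay the response via purgatory rather
            // than block a handler thread; printing keeps the sketch
            // self-contained.
            println(s"client $clientId exceeded $quotaPercent% of handler time")
          }
        }
      }
    }

KafkaApis.handle() would then just wrap its existing match on the request id
in handleWithQuota(clientId) { ... } and no individual handler needs to
change.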

Thanks,

Jun


Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Jun,

Thank you for the review.

I have reverted to the original KIP that throttles based on request handler
utilization. At the moment, it uses percentage, but I am happy to change to
a fraction (out of 1 instead of 100) if required. I have added the examples
from this discussion to the KIP. Also added a "Future Work" section to
address network thread utilization. The configuration is named
"request_time_percent" with the expectation that it can also be used as the
limit for network thread utilization when that is implemented, so that
users have to set only one config for the two and not have to worry about
the internal distribution of the work between the two thread pools in Kafka.
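
Assuming it is wired into the same dynamic quota config mechanism as the
existing byte-rate quotas, setting it for a user would look something like:

    bin/kafka-configs.sh --zookeeper localhost:2181 --alter \
        --add-config 'request_time_percent=5' \
        --entity-type users --entity-name user1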


Regards,

Rajini



Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jun Rao <ju...@confluent.io>.
Hi, Rajini,

Thanks for the proposal.

The benefit of using the request processing time over the request rate is
exactly what people have said. I will just expand that a bit. Consider the
following case. The producer sends a produce request with a 10MB message
but compressed to 100KB with gzip. The decompression of the message on the
broker could take 10-15 seconds, during which time, a request handler
thread is completely blocked. In this case, neither the byte-in quota nor
the request rate quota may be effective in protecting the broker. Consider
another case. A consumer group starts with 10 instances and later on
switches to 20 instances. The request rate will likely double, but the
actual load on the broker may not double since each fetch request only
contains half of the partitions. Request rate quota may not be easy to
configure in this case.
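
To put rough numbers on the first case: if decompression blocks a handler
thread for 10 seconds per request, a client sending just one such request
per second would on its own keep about 10 handler threads permanently busy,
while staying far below any plausible request rate or byte-in limit.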

What we really want is to be able to prevent a client from using too much
of the server-side resources. In this particular KIP, this resource is the
capacity of the request handler threads. I agree that it may not be
intuitive for the users to determine how to set the right limit. However,
this is not completely new and has been done in the container world
already. For example, Linux cgroup (https://access.redhat.com/
documentation/en-US/Red_Hat_Enterprise_Linux/6/html/
Resource_Management_Guide/sec-cpu.html) has the concept of cpu.cfs_quota_us,
which specifies the total amount of time in microseconds for which all
tasks in a cgroup can run during a one second period. We can potentially
model the request handler threads in a similar way. For example, each
request handler thread can be 1 request handler unit and the admin can
configure a limit on how many units (say 0.01) a client can have.
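
As a concrete cgroup illustration (paths and values illustrative), capping a
group at 10% of one core looks like:

    # 10,000us of CPU per 100,000us period = 10% of one core
    echo 100000 > /sys/fs/cgroup/cpu/mygroup/cpu.cfs_period_us
    echo 10000  > /sys/fs/cgroup/cpu/mygroup/cpu.cfs_quota_us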

Regarding not throttling the internal broker-to-broker requests: we could
do that. Alternatively, we could just let the admin configure a high limit
for the kafka user (though that may not be easy to do based on clientId).

Ideally we want to be able to protect the utilization of the network thread
pool too. The difficulty is mostly what Rajini said: (1) The mechanism for
throttling the requests is through Purgatory and we will have to think
through how to integrate that into the network layer.  (2) In the network
layer, currently we know the user, but not the clientId of the request. So,
it's a bit tricky to throttle based on clientId there. Plus, the byteOut
quota can already protect the network thread utilization for fetch
requests. So, if we can't figure out this part right now, just focusing on
the request handling threads for this KIP is still a useful feature.
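
For what it's worth, the delay computation itself could keep the shape of
the existing byte-rate quota logic (my rough reading of it; the exact
formula in the code may differ): once a client's observed handler-time ratio
exceeds its bound, hold the response in purgatory long enough to bring the
ratio back under the bound, roughly:

    // Sketch only, assuming a sampled-window rate like the byte-rate quotas.
    def throttleTimeMs(observed: Double, bound: Double, windowMs: Long): Long =
      if (observed <= bound) 0L
      else (((observed - bound) / bound) * windowMs).toLong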

Thanks,

Jun


On Tue, Feb 21, 2017 at 4:27 AM, Rajini Sivaram <ra...@gmail.com>
wrote:

> Thank you all for the feedback.
>
> Jay: I have removed exemption for consumer heartbeat etc. Agree that
> protecting the cluster is more important than protecting individual apps.
> Have retained the exemption for StopReplica/LeaderAndIsr etc.; these are
> throttled only if authorization fails (so can't be used for DoS attacks in
> a secure cluster, but allows inter-broker requests to complete without
> delays).
>
> I will wait another day to see if there is any objection to quotas based on
> request processing time (as opposed to request rate) and if there are no
> objections, I will revert to the original proposal with some changes.
>
> The original proposal was only including the time used by the request
> handler threads (that made calculation easy). I think the suggestion is to
> include the time spent in the network threads as well since that may be
> significant. As Jay pointed out, it is more complicated to calculate the
> total available CPU time and convert to a ratio when there are *m* I/O threads
> and *n* network threads. ThreadMXBean#getThreadCPUTime() may give us what
> we want, but it can be very expensive on some platforms. As Becket and
> Guozhang have pointed out, we do have several time measurements already for
> generating metrics that we could use, though we might want to switch to
> nanoTime() instead of currentTimeMillis() since some of the values for
> small requests may be < 1ms. But rather than add up the time spent in I/O
> thread and network thread, wouldn't it be better to convert the time spent
> on each thread into a separate ratio? UserA has a request quota of 5%. Can
> we take that to mean that UserA can use 5% of the time on network threads
> and 5% of the time on I/O threads? If either is exceeded, the response is
> throttled - it would mean maintaining two sets of metrics for the two
> durations, but would result in more meaningful ratios. We could define two
> quota limits (UserA has 5% of request threads and 10% of network threads),
> but that seems unnecessary and harder to explain to users.
>
> Back to why and how quotas are applied to network thread utilization:
> a) In the case of fetch,  the time spent in the network thread may be
> significant and I can see the need to include this. Are there other
> requests where the network thread utilization is significant? In the case
> of fetch, request handler thread utilization would throttle clients with
> high request rate, low data volume and fetch byte rate quota will throttle
> clients with high data volume. Network thread utilization is perhaps
> proportional to the data volume. I am wondering if we even need to throttle
> based on network thread utilization or whether the data volume quota covers
> this case.
>
> b) At the moment, we record and check for quota violation at the same time.
> If a quota is violated, the response is delayed. Using Jay's example of
> disk reads for fetches happening in the network thread, we can't record and
> delay a response after the disk reads. We could record the time spent on
> the network thread when the response is complete and introduce a delay for
> handling a subsequent request (separate out recording and quota violation
> handling in the case of network thread overload). Does that make sense?
>
>
> Regards,
>
> Rajini
>
>
> On Tue, Feb 21, 2017 at 2:58 AM, Becket Qin <be...@gmail.com> wrote:
>
> > Hey Jay,
> >
> > Yeah, I agree that enforcing the CPU time is a little tricky. I am
> thinking
> > that maybe we can use the existing request statistics. They are already
> > very detailed so we can probably see the approximate CPU time from it,
> e.g.
> > something like (total_time - request/response_queue_time - remote_time).
> >
> > I agree with Guozhang that when a user is throttled it is likely that we
> > need to see if anything has gone wrong first, and if the users are well
> > behaving and just need more resources, we will have to bump up the quota
> > for them. It is true that pre-allocating CPU time quota precisely for the
> > users is difficult. So in practice it would probably be more like first
> set
> > a relatively high protective CPU time quota for everyone and increase that
> > for some individual clients on demand.
> >
> > Thanks,
> >
> > Jiangjie (Becket) Qin
> >
> >
> > On Mon, Feb 20, 2017 at 5:48 PM, Guozhang Wang <wa...@gmail.com>
> wrote:
> >
> > > This is a great proposal, glad to see it happening.
> > >
> > > I am inclined toward the CPU throttling, or more specifically processing
> time
> > > ratio instead of the request rate throttling as well. Becket has very
> > well
> > > summed up my rationales above, and one thing to add here is that the
> former
> > > has good support for both "protecting against rogue clients" as well
> as
> > > "utilizing a cluster for multi-tenancy usage": when thinking about how
> to
> > > explain this to the end users, I find it actually more natural than the
> > > request rate since as mentioned above, different requests will have
> quite
> > > different "cost", and Kafka today already has various request types
> > > (produce, fetch, admin, metadata, etc.); because of that, the request
> rate
> > > throttling may not be as effective unless it is set very
> conservatively.
> > >
> > > Regarding user reactions when they are throttled, I think it may
> > differ
> > > case-by-case, and needs to be discovered / guided by looking at relative
> > > metrics. So in other words users would not expect to get additional
> > > information by simply being told "hey, you are throttled", which is all
> > > that throttling does; they need to take a follow-up step and see "hmm,
> > I'm
> > > throttled probably because of ..", which is by looking at other metric
> > > values: e.g. whether I'm bombarding the brokers with metadata request,
> > > which are usually cheap to handle but I'm sending thousands per second;
> > or
> > > is it because I'm catching up and hence sending very heavy fetching
> > request
> > > with large min.bytes, etc.
> > >
> > > Regarding the implementation, as once discussed with Jun, this seems
> > not
> > > very difficult since today we are already collecting the "thread pool
> > > utilization" metrics, which is a single percentage "aggregateIdleMeter"
> > > value; but we are already effectively aggregating it for each requests
> in
> > > KafkaRequestHandler, and we can just extend it by recording the source
> > > client id when handling them and aggregating by clientId as well as the
> > > total aggregate.
> > >
> > >
> > > Guozhang
> > >
> > >
> > >
> > >
> > > On Mon, Feb 20, 2017 at 4:27 PM, Jay Kreps <ja...@confluent.io> wrote:
> > >
> > > > Hey Becket/Rajini,
> > > >
> > > > When I thought about it more deeply I came around to the "percent of
> > > > processing time" metric too. It seems a lot closer to the thing we
> > > actually
> > > > care about and need to protect. I also think this would be a very
> > useful
> > > > metric even in the absence of throttling just to debug whose using
> > > > capacity.
> > > >
> > > > Two problems to consider:
> > > >
> > > >    1. I agree that for the user it is understandable what lead to
> their
> > > >    being throttled, but it is a bit hard to figure out the safe range
> > for
> > > >    them. i.e. if I have a new app that will send 200 messages/sec I
> can
> > > >    probably reason that I'll be under the throttling limit of 300
> > > req/sec.
> > > >    However if I need to be under a 10% CPU resources limit it may be
> a
> > > bit
> > > >    harder for me to know a priori if i will or won't.
> > > >    2. Calculating the available CPU time is a bit difficult since
> there
> > > are
> > > >    actually two thread pools--the I/O threads and the network
> threads.
> > I
> > > > think
> > > >    it might be workable to count just the I/O thread time as in the
> > > > proposal,
> > > >    but the network thread work is actually non-trivial (e.g. all the
> > disk
> > > >    reads for fetches happen in that thread). If you count both the
> > > network
> > > > and
> > > >    I/O threads it can skew things a bit. E.g. say you have 50 network
> > > > threads,
> > > >    10 I/O threads, and 8 cores, what is the available cpu time
> > available
> > > > in a
> > > >    second? I suppose this is a problem whenever you have a bottleneck
> > > > between
> > > >    I/O and network threads or if you end up significantly
> > > over-provisioning
> > > >    one pool (both of which are hard to avoid).
> > > >
> > > > An alternative for CPU throttling would be to use this api:
> > > > http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/
> > > > management/ThreadMXBean.html#getThreadCpuTime(long)
> > > >
> > > > That would let you track actual CPU usage across the network, I/O
> > > threads,
> > > > and purgatory threads and look at it as a percentage of total cores.
> I
> > > > think this fixes many problems in the reliability of the metric. It's
> > > > meaning is slightly different as it is just CPU (you don't get
> charged
> > > for
> > > > time blocking on I/O) but that may be okay because we already have a
> > > > throttle on I/O. The downside is I think it is possible this api can
> be
> > > > disabled or isn't always available and it may also be expensive (also
> > > I've
> > > > never used it so not sure if it really works the way i think).
> > > >
> > > > -Jay
> > > >
> > > > On Mon, Feb 20, 2017 at 3:17 PM, Becket Qin <be...@gmail.com>
> > > wrote:
> > > >
> > > > > If the purpose of the KIP is only to protect the cluster from being
> > > > > overwhelmed by crazy clients and is not intended to address
> resource
> > > > > allocation problem among the clients, I am wondering if using
> request
> > > > > handling time quota (CPU time quota) is a better option. Here are
> the
> > > > > reasons:
> > > > >
> > > > > 1. request handling time quota has better protection. Say we have
> > > request
> > > > > rate quota and set that to some value like 100 requests/sec, it is
> > > > possible
> > > > > that some of the requests are very expensive actually take a lot of
> > > time
> > > > to
> > > > > handle. In that case a few clients may still occupy a lot of CPU
> time
> > > > even
> > > > > the request rate is low. Arguably we can carefully set request rate
> > > quota
> > > > > for each request and client id combination, but it could still be
> > > tricky
> > > > to
> > > > > get it right for everyone.
> > > > >
> > > > > If we use the request time handling quota, we can simply say no
> > clients
> > > > can
> > > > > take up to more than 30% of the total request handling capacity
> > > (measured
> > > > > by time), regardless of the difference among different requests or
> > what
> > > > is
> > > > > the client doing. In this case maybe we can quota all the requests
> if
> > > we
> > > > > want to.
> > > > >
> > > > > 2. The main benefit of using request rate limit is that it seems
> more
> > > > > intuitive. It is true that it is probably easier to explain to the
> > user
> > > > > what does that mean. However, in practice it looks the impact of
> > > request
> > > > > rate quota is not more quantifiable than the request handling time
> > > quota.
> > > > > Unlike the byte rate quota, it is still difficult to give a number
> > > about
> > > > > impact of throughput or latency when a request rate quota is hit.
> So
> > it
> > > > is
> > > > > not better than the request handling time quota. In fact I feel it
> is
> > > > > clearer to tell user that "you are limited because you have taken
> 30%
> > > of
> > > > > the CPU time on the broker" than otherwise something like "your
> > request
> > > > > rate quota on metadata request has reached".
> > > > >
> > > > > Thanks,
> > > > >
> > > > > Jiangjie (Becket) Qin
> > > > >
> > > > >
> > > > > On Mon, Feb 20, 2017 at 2:23 PM, Jay Kreps <ja...@confluent.io>
> wrote:
> > > > >
> > > > > > I think this proposal makes a lot of sense (especially now that
> it
> > is
> > > > > > oriented around request rate) and fills the biggest remaining gap
> > in
> > > > the
> > > > > > multi-tenancy story.
> > > > > >
> > > > > > I think for intra-cluster communication (StopReplica, etc) we
> could
> > > > avoid
> > > > > > throttling entirely. You can secure or otherwise lock-down the
> > > cluster
> > > > > > communication to avoid any unauthorized external party from
> trying
> > to
> > > > > > initiate these requests. As a result we are as likely to cause
> > > problems
> > > > > as
> > > > > > solve them by throttling these, right?
> > > > > >
> > > > > > I'm not so sure that we should exempt the consumer requests such
> as
> > > > > > heartbeat. It's true that if we throttle an app's heartbeat
> > requests
> > > it
> > > > > may
> > > > > > cause it to fall out of its consumer group. However if we don't
> > > > throttle
> > > > > it
> > > > > > it may DDOS the cluster if the heartbeat interval is set
> > incorrectly
> > > or
> > > > > if
> > > > > > some client in some language has a bug. I think the policy with
> > this
> > > > kind
> > > > > > of throttling is to protect the cluster above any individual app,
> > > > right?
> > > > > I
> > > > > > think in general this should be okay since for most deployments
> > this
> > > > > > setting is meant as more of a safety valve---that is rather than
> > set
> > > > > > something very close to what you expect to need (say 2 req/sec or
> > > > > whatever)
> > > > > > you would have something quite high (like 100 req/sec) with this
> > > meant
> > > > to
> > > > > > prevent a client gone crazy. I think when used this way allowing
> > > those
> > > > to
> > > > > > be throttled would actually provide meaningful protection.
> > > > > >
> > > > > > -Jay
> > > > > >
> > > > > >
> > > > > >
> > > > > > On Fri, Feb 17, 2017 at 9:05 AM, Rajini Sivaram <
> > > > rajinisivaram@gmail.com
> > > > > >
> > > > > > wrote:
> > > > > >
> > > > > > > Hi all,
> > > > > > >
> > > > > > > I have just created KIP-124 to introduce request rate quotas to
> > > > Kafka:
> > > > > > >
> > > > > > > https://cwiki.apache.org/confluence/display/KAFKA/KIP-
> > > > > > > 124+-+Request+rate+quotas
> > > > > > >
> > > > > > > The proposal is for a simple percentage request handling time
> > quota
> > > > > that
> > > > > > > can be allocated to *<client-id>*, *<user>* or *<user,
> > client-id>*.
> > > > > There
> > > > > > > are a few other suggestions also under "Rejected alternatives".
> > > > > Feedback
> > > > > > > and suggestions are welcome.
> > > > > > >
> > > > > > > Thank you...
> > > > > > >
> > > > > > > Regards,
> > > > > > >
> > > > > > > Rajini
> > > > > > >
> > > > > >
> > > > >
> > > >
> > >
> > >
> > >
> > > --
> > > -- Guozhang
> > >
> >
>

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Rajini Sivaram <ra...@gmail.com>.
Thank you all for the feedback.

Jay: I have removed exemption for consumer heartbeat etc. Agree that
protecting the cluster is more important than protecting individual apps.
Have retained the exemption for StopReplica/LeaderAndIsr etc.; these are
throttled only if authorization fails (so can't be used for DoS attacks in
a secure cluster, but allows inter-broker requests to complete without
delays).

I will wait another day to see if there is any objection to quotas based on
request processing time (as opposed to request rate) and if there are no
objections, I will revert to the original proposal with some changes.

The original proposal included only the time used by the request handler
threads (which made the calculation easy). I think the suggestion is to
include the time spent in the network threads as well since that may be
significant. As Jay pointed out, it is more complicated to calculate the
total available CPU time and convert to a ratio when there are *m* I/O
threads and *n* network threads. ThreadMXBean#getThreadCpuTime() may give us what
we want, but it can be very expensive on some platforms. As Becket and
Guozhang have pointed out, we do have several time measurements already for
generating metrics that we could use, though we might want to switch to
nanoTime() instead of currentTimeMillis() since some of the values for
small requests may be < 1ms. But rather than adding up the time spent in
the I/O thread and the network thread, wouldn't it be better to convert the
time spent
on each thread into a separate ratio? UserA has a request quota of 5%. Can
we take that to mean that UserA can use 5% of the time on network threads
and 5% of the time on I/O threads? If either is exceeded, the response is
throttled - it would mean maintaining two sets of metrics for the two
durations, but would result in more meaningful ratios. We could define two
quota limits (UserA has 5% of request threads and 10% of network threads),
but that seems unnecessary and harder to explain to users.
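
To make the two-ratio idea concrete, here is a minimal sketch (all class
and field names below are made up for illustration, not actual Kafka code):
one quota value, checked independently against the time a user's requests
spend on each thread pool.

    // A minimal sketch of the "same quota, two ratios" idea; all names
    // here are illustrative, not actual Kafka classes.
    class UserRequestTimeQuota {
        private final double quotaRatio;     // e.g. 0.05 == 5% of each pool
        private long ioThreadNanos;          // time on I/O threads in this window
        private long networkThreadNanos;     // time on network threads in this window

        UserRequestTimeQuota(double quotaRatio) { this.quotaRatio = quotaRatio; }

        synchronized void recordIoTime(long nanos)      { ioThreadNanos += nanos; }
        synchronized void recordNetworkTime(long nanos) { networkThreadNanos += nanos; }

        // Violated if the user exceeded the ratio on *either* pool.
        synchronized boolean violated(long windowNanos) {
            return (double) ioThreadNanos / windowNanos > quotaRatio
                || (double) networkThreadNanos / windowNanos > quotaRatio;
        }

        synchronized void resetWindow() { ioThreadNanos = 0; networkThreadNanos = 0; }
    }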

Back to why and how quotas are applied to network thread utilization:
a) In the case of fetch, the time spent in the network thread may be
significant and I can see the need to include this. Are there other
requests where the network thread utilization is significant? In the case
of fetch, request handler thread utilization would throttle clients with
high request rate and low data volume, and the fetch byte rate quota will throttle
clients with high data volume. Network thread utilization is perhaps
proportional to the data volume. I am wondering if we even need to throttle
based on network thread utilization or whether the data volume quota covers
this case.

b) At the moment, we record and check for quota violation at the same time.
If a quota is violated, the response is delayed. Using Jay's example of
disk reads for fetches happening in the network thread, we can't record and
delay a response after the disk reads. We could record the time spent on
the network thread when the response is complete and introduce a delay for
handling a subsequent request (separate out recording and quota violation
handling in the case of network thread overload). Does that make sense?
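
As a sketch of what separating the two steps could look like (again with
hypothetical names, reusing the UserRequestTimeQuota sketch above): the
network thread records its time once the response is complete, and the
penalty is applied to the user's next request instead of the current one.

    // Hypothetical sketch: decoupled recording and enforcement for
    // network thread time.
    class DeferredThrottle {
        private final UserRequestTimeQuota quota;  // from the sketch above
        private final long windowNanos;

        DeferredThrottle(UserRequestTimeQuota quota, long windowNanos) {
            this.quota = quota;
            this.windowNanos = windowNanos;
        }

        // Called by the network thread after the response is fully written;
        // the time (including any disk reads) is only known at this point.
        void onResponseComplete(long networkNanosUsed) {
            quota.recordNetworkTime(networkNanosUsed);
        }

        // Called when the *next* request from this user arrives; returns a
        // delay (ms) to hold the request before handling it.
        long delayForNextRequestMs() {
            return quota.violated(windowNanos) ? 100 : 0;  // fixed delay for illustration
        }
    }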


Regards,

Rajini

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Becket Qin <be...@gmail.com>.
Hey Jay,

Yeah, I agree that enforcing the CPU time is a little tricky. I am thinking
that maybe we can use the existing request statistics. They are already
very detailed, so we can probably derive the approximate CPU time from them, e.g.
something like (total_time - request/response_queue_time - remote_time).
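
As a rough sketch (the parameter names below just mirror the per-request
timing breakdown conceptually; this is not the actual metrics API):

    final class RequestTimeEstimate {
        // Approximate on-broker processing time for one request: whatever
        // is left after queueing and remote (e.g. purgatory) time is
        // roughly the time threads spent actively working on the request.
        static double approximateCpuTimeMs(double totalTimeMs,
                                           double requestQueueTimeMs,
                                           double responseQueueTimeMs,
                                           double remoteTimeMs) {
            return totalTimeMs - requestQueueTimeMs - responseQueueTimeMs - remoteTimeMs;
        }
    }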

I agree with Guozhang that when a user is throttled it is likely that we
need to see if anything has gone wrong first, and if the users are well
behaved and just need more resources, we will have to bump up the quota
for them. It is true that pre-allocating CPU time quota precisely for the
users is difficult. So in practice it would probably be more like first
setting a relatively high protective CPU time quota for everyone and
increasing that for some individual clients on demand.

Thanks,

Jiangjie (Becket) Qin

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Guozhang Wang <wa...@gmail.com>.
This is a great proposal, glad to see it happening.

I am inclined toward CPU throttling as well, or more specifically a
processing time ratio rather than request rate throttling. Becket has
summed up my rationale very well above, and one thing to add here is that
the former supports both "protecting against rogue clients" and "utilizing
a cluster for multi-tenancy usage": when thinking about how to explain this
to end users, I find it actually more natural than the request rate since,
as mentioned above, different requests have quite different "cost", and
Kafka today already has various request types (produce, fetch, admin,
metadata, etc.); because of that, request rate throttling may not be as
effective unless it is set very conservatively.

Regarding user reactions when they are throttled, I think it may differ
case-by-case, and needs to be discovered / guided by looking at the
relevant metrics. In other words, users should not expect to get additional
information by simply being told "hey, you are throttled", which is all
that throttling does; they need to take a follow-up step and see "hmm, I'm
throttled probably because of ..." by looking at other metric values: e.g.
whether I'm bombarding the brokers with metadata requests, which are
usually cheap to handle but which I'm sending thousands of per second; or
whether I'm catching up and hence sending very heavy fetch requests with
large min.bytes, etc.

Regarding the implementation, as once discussed with Jun, this seems not
very difficult since today we are already collecting the "thread pool
utilization" metric, which is a single percentage "aggregateIdleMeter"
value; we are already effectively measuring it for each request in
KafkaRequestHandler, and we can just extend it by recording the source
client id when handling requests and aggregating by clientId as well as
keeping the total aggregate.
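
A rough sketch of that extension (the actual handler code differs; the
class below is purely illustrative):

    // Illustrative sketch: aggregate busy time per clientId alongside the
    // existing pool-wide idle meter.
    import java.util.Map;
    import java.util.concurrent.ConcurrentHashMap;
    import java.util.concurrent.atomic.LongAdder;

    class PerClientBusyTime {
        private final Map<String, LongAdder> busyNanosByClient = new ConcurrentHashMap<>();
        private final LongAdder totalBusyNanos = new LongAdder();

        void record(String clientId, long busyNanos) {
            busyNanosByClient.computeIfAbsent(clientId, id -> new LongAdder()).add(busyNanos);
            totalBusyNanos.add(busyNanos);
        }

        // Fraction of all request handling time consumed by one client.
        double shareOf(String clientId) {
            LongAdder client = busyNanosByClient.get(clientId);
            long total = totalBusyNanos.sum();
            return (client == null || total == 0) ? 0.0 : (double) client.sum() / total;
        }
    }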


Guozhang

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jay Kreps <ja...@confluent.io>.
Hey Becket/Rajini,

When I thought about it more deeply I came around to the "percent of
processing time" metric too. It seems a lot closer to the thing we actually
care about and need to protect. I also think this would be a very useful
metric even in the absence of throttling just to debug who's using capacity.

Two problems to consider:

   1. I agree that for the user it is understandable what led to their
   being throttled, but it is a bit hard to figure out the safe range for
   them. i.e. if I have a new app that will send 200 messages/sec I can
   probably reason that I'll be under the throttling limit of 300 req/sec.
   However if I need to be under a 10% CPU resources limit it may be a bit
   harder for me to know a priori if I will or won't.
   2. Calculating the available CPU time is a bit difficult since there are
   actually two thread pools--the I/O threads and the network threads. I think
   it might be workable to count just the I/O thread time as in the proposal,
   but the network thread work is actually non-trivial (e.g. all the disk
   reads for fetches happen in that thread). If you count both the network and
   10 I/O threads, and 8 cores, what is the available CPU time in a second
   (see the arithmetic below)? I suppose this is a problem whenever you have
   a bottleneck between
   second? I suppose this is a problem whenever you have a bottleneck between
   I/O and network threads or if you end up significantly over-provisioning
   one pool (both of which are hard to avoid).
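
To put numbers on the example in point 2: summing busy thread time over 50
network threads and 10 I/O threads gives a nominal capacity of 60
thread-seconds per wall-clock second, but 8 cores can deliver at most 8
CPU-seconds in that second, so a thread-time ratio can overstate true CPU
usage by up to 7.5x depending on how often the threads are blocked.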

An alternative for CPU throttling would be to use this api:
http://docs.oracle.com/javase/1.5.0/docs/api/java/lang/management/ThreadMXBean.html#getThreadCpuTime(long)

That would let you track actual CPU usage across the network, I/O threads,
and purgatory threads and look at it as a percentage of total cores. I
think this fixes many problems in the reliability of the metric. Its
meaning is slightly different as it is just CPU (you don't get charged for
time blocking on I/O) but that may be okay because we already have a
throttle on I/O. The downside is I think it is possible this api can be
disabled or isn't always available and it may also be expensive (also I've
never used it, so I'm not sure if it really works the way I think).
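
For reference, a minimal probe of that API (standard java.lang.management,
though as noted its availability and cost vary by JVM and platform):

    import java.lang.management.ManagementFactory;
    import java.lang.management.ThreadMXBean;

    public class CpuTimeProbe {
        public static void main(String[] args) {
            ThreadMXBean bean = ManagementFactory.getThreadMXBean();
            if (!bean.isThreadCpuTimeSupported()) {
                System.out.println("Per-thread CPU time not supported here");
                return;
            }
            bean.setThreadCpuTimeEnabled(true);         // may be disabled by default
            long id = Thread.currentThread().getId();
            long cpuNanos = bean.getThreadCpuTime(id);  // -1 if unavailable
            System.out.println("CPU time for thread " + id + ": " + cpuNanos + " ns");
        }
    }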

-Jay

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Becket Qin <be...@gmail.com>.
If the purpose of the KIP is only to protect the cluster from being
overwhelmed by crazy clients and is not intended to address resource
allocation problem among the clients, I am wondering if using request
handling time quota (CPU time quota) is a better option. Here are the
reasons:

1. A request handling time quota has better protection. Say we have a
request rate quota and set it to some value like 100 requests/sec; it is
possible that some of the requests are very expensive and actually take a
lot of time to handle. In that case a few clients may still occupy a lot of
CPU time even though the request rate is low. Arguably we can carefully set
a request rate quota for each request type and client id combination, but
it could still be tricky to get it right for everyone.

If we use the request handling time quota, we can simply say no client can
take more than 30% of the total request handling capacity (measured by
time), regardless of the differences among requests or what the client is
doing. In this case maybe we can quota all the requests if we want to.
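
In pseudo-form (the threshold and names below are purely for illustration),
the check is simply:

    final class HandlingTimeQuotaCheck {
        // Illustrative check: throttle a client once it has consumed more
        // than 30% of the broker's total request handling time in the
        // current window.
        static boolean shouldThrottle(long clientHandlingNanos, long totalHandlingNanos) {
            double maxShare = 0.30;  // illustrative threshold
            return totalHandlingNanos > 0
                && (double) clientHandlingNanos / totalHandlingNanos > maxShare;
        }
    }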

2. The main benefit of using a request rate limit is that it seems more
intuitive: it is probably easier to explain to the user what it means.
However, in practice the impact of a request rate quota is no more
quantifiable than that of a request handling time quota. Unlike the byte
rate quota, it is still difficult to give a number for the impact on
throughput or latency when a request rate quota is hit, so it is not better
than the request handling time quota in that respect. In fact I feel it is
clearer to tell a user "you are limited because you have taken 30% of the
CPU time on the broker" than something like "your request rate quota on
metadata requests has been reached".

Thanks,

Jiangjie (Becket) Qin

Re: [DISCUSS] KIP-124: Request rate quotas

Posted by Jay Kreps <ja...@confluent.io>.
I think this proposal makes a lot of sense (especially now that it is
oriented around request rate) and fills the biggest remaining gap in the
multi-tenancy story.

I think for intra-cluster communication (StopReplica, etc) we could avoid
throttling entirely. You can secure or otherwise lock-down the cluster
communication to avoid any unauthorized external party from trying to
initiate these requests. As a result we are as likely to cause problems as
solve them by throttling these, right?

I'm not so sure that we should exempt the consumer requests such as
heartbeat. It's true that if we throttle an app's heartbeat requests it may
cause it to fall out of its consumer group. However, if we don't throttle it,
it may DDOS the cluster if the heartbeat interval is set incorrectly or if
some client in some language has a bug. I think the policy with this kind
of throttling is to protect the cluster above any individual app, right? I
think in general this should be okay since for most deployments this
setting is meant as more of a safety valve: that is, rather than setting
something very close to what you expect to need (say 2 req/sec or whatever)
you would have something quite high (like 100 req/sec) with this meant to
prevent a client that has gone crazy. I think when used this way, allowing those to
be throttled would actually provide meaningful protection.

-Jay