Posted to users@kafka.apache.org by David Ballano Fernandez <df...@demonware.net> on 2022/03/02 00:06:59 UTC

Re: consumer hpa autoscaling

Thanks Liam,

I am trying HPA using CPU utilization, but since everything is tied to the
partition count etc., I wonder what the benefits of running on HPA really
are.

thanks!

On Mon, Feb 28, 2022 at 12:59 PM Liam Clarke-Hutchinson <lc...@redhat.com>
wrote:

> I've used HPAs scaling on lag before by feeding lag metrics from Prometheus
> into the K8s metrics server as custom metrics.
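>
> E.g., with something like prometheus-adapter exposing consumer lag to the
> external metrics API, the HPA metric spec looks roughly like this (the
> metric name depends on your exporter; kafka_consumergroup_lag is what
> kafka-exporter emits):
>
>   metrics:
>   - type: External
>     external:
>       metric:
>         name: kafka_consumergroup_lag
>       target:
>         type: AverageValue
>         averageValue: "1000"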
>
> That said, you need to carefully control scaling frequency to avoid
> excessive consumer group rebalances. The cooperative sticky assignor can
> minimise pauses, but not remove them entirely.
>
> There are a lot of knobs you can use to tune HPAs these days:
>
> https://kubernetes.io/docs/tasks/run-application/horizontal-pod-autoscale/#configurable-scaling-behavior
>
> Good luck :)
>
>
>
> On Tue, 1 Mar 2022 at 08:49, David Ballano Fernandez <
> dfernandez@demonware.net> wrote:
>
> > Hello guys,
> >
> > I was wondering how you do autoscaling of your consumers in Kubernetes,
> > if you do any.
> >
> > We have a mirrormaker-like app that mirrors data from cluster to cluster
> > and at the same time does some topic routing. I would like to add an HPA
> > to the app in order to scale up/down depending on avg CPU, but as you
> > know a consumer app has lots of variables, the partition count of the
> > consumed topics being a pretty important one.
> >
> > Since Kubernetes checks avg CPU, there is a chance that pods/consumers
> > won't be scaled up to the number of partitions, possibly creating some
> > hot spots.
> >
> > Anyway, I would like to know how you deal with this, if you do at all.
> >
> > thanks!
> >
>

Re: consumer hpa autoscaling

Posted by David Ballano Fernandez <df...@demonware.net>.
Thanks, and thanks also to Fares for pointing me to Keda.

Re: consumer hpa autoscaling

Posted by Liam Clarke-Hutchinson <lc...@redhat.com>.
> I was trying to see what the goals of enabling HPA on the consumer would
> be. Since, like you say, there is a partition upper limit which will limit
> the consumer throughput, in the end you have to tweak partitions on Kafka
> and then reassess the maxReplicas config of the HPA. It seems HPA in this
> scenario would help more with costs than with operations around the app.

Yep, the goal of using an HPA with 1 to N instances of a consuming app is
to scale consumers out at peak load, and then scale them down when load's a
lot lower.
It helps meet any data timeliness requirements you might have during high
load, and as you said, reduces costs during low load.

Re: consumer hpa autoscaling

Posted by David Ballano Fernandez <df...@demonware.net>.
Hi Liam,

I was trying to see what the goals of enabling HPA on the consumer would
be. Since, like you say, there is a partition upper limit which will limit
the consumer throughput, in the end you have to tweak partitions on Kafka
and then reassess the maxReplicas config of the HPA. It seems HPA in this
scenario would help more with costs than with operations around the app.

Maybe there is a way to build your own algorithm to figure out max/min
replicas and other fanciness depending on partitions (via an operator), etc.

But I wonder if you would still end up in the same boat; plus, does it make
sense to over-engineer this when in the end you might have to add
partitions manually? That is why I like the HPA, since it's "simple" and you
can easily understand the behaviour.
The behaviour of this app, like you say, is seasonal: it has peaks and
troughs every day, so there are some benefits to running an HPA there.

About consumer group rebalances, yeah, I get what you mean. I did tweak some
scale up/down policies to make it smoother. The app seems fine, but I might
enable cooperative-sticky just to see if that helps a bit more. But so far
I am not seeing a negative impact on the app.
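
(If I understood the docs right, switching is just a consumer config change,
something like
partition.assignment.strategy=org.apache.kafka.clients.consumer.CooperativeStickyAssignor,
though I still need to check how to roll that out safely to a live group.)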

This is what I am using for the HPA so far, nothing complex:

spec:
  scaleTargetRef:
    apiVersion: apps/v1
    kind: Deployment
    name: app-staging-test
  minReplicas: 56
  maxReplicas: 224
  behavior:
    scaleUp:
      stabilizationWindowSeconds: 60
      policies:
      - type: Percent
        value: 100
        periodSeconds: 60
  metrics:
    - resource:
        name: cpu
        target:
          averageUtilization: 30
          type: Utilization
      type: Resource
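
Note I've only tuned scaleUp; scaleDown falls back to the defaults, which I
believe is a 300s stabilization window. I may pin it down explicitly later,
something like:

    scaleDown:
      stabilizationWindowSeconds: 300
      policies:
      - type: Percent
        value: 10
        periodSeconds: 60

so replicas drain gradually instead of all at once.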



Thanks!

Re: consumer hpa autoscaling

Posted by Liam Clarke-Hutchinson <lc...@redhat.com>.
Hi David,

Scaling on CPU can be fine; what you scale on depends on what resource
constrains your consuming application. CPU is a good proxy for "I'm working
really hard", so not a bad one to start with.

The main thing to be aware of is tuning the HPA to minimise scaling that
causes "stop-the-world" consumer group rebalances; the documentation I linked
earlier offers good advice. But you'll need to determine the best way to
configure your HPA based on your particular workloads - in other words, a
lot of trial and error. :)
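
For example, while you're experimenting you can turn scale-down off entirely
and only let the HPA add pods (from memory, the v2 behavior API supports
this):

behavior:
  scaleDown:
    selectPolicy: Disabled

and then re-enable it with a long stabilization window once you trust the
scaling pattern.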

In terms of "everything is tied to partition number", there is an obvious
upper limit when scaling consumers in a consumer group - if you have 20
partitions on a topic, a consumer group consuming from that topic will only
increase throughput when scaling up to 20 instances. If you have 30
instances, 10 instances won't be assigned partitions unless some of the
other instances fail.

However, the real advantage of an HPA is in reducing cost / load,
especially in a cloud environment - if the throughput on a given topic is
low, and one consumer can easily handle all 20 partitions, then you're
wasting money running 19 other instances. But if throughput suddenly
increases, the HPA will let your consumer instances scale up automatically,
and then scale down when the throughput drops again.

It really depends on how throughput on your topic varies - if you're
working in a domain where throughput shows high seasonality over the day
(e.g., at 4 a.m. no-one is using your website, at 8 p.m. everyone is using
it) then an HPA approach is ideal. But, as I said, you'll need to tune how
your HPA scales to prevent repeated scaling up and down that interferes
with the consumer group overall.

If you have any more details on what problem you're trying to solve, I
might be able to give more specific advice.

TL;DR - I've found using HPAs to scale applications in the same consumer
group is very useful, but it needs to be tuned to minimise scaling that can
cause pauses in consumption.

Kind regards,

Liam Clarke-Hutchinson

Re: consumer hpa autoscaling

Posted by Fares Oueslati <ou...@gmail.com>.
Hello

You can look at keda.sh, which allows you to autoscale workloads on
Kubernetes according to many custom metrics, including Kafka consumer lag.
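
A lag-based ScaledObject looks roughly like this (names are illustrative;
see the KEDA docs for the full Kafka scaler spec):

apiVersion: keda.sh/v1alpha1
kind: ScaledObject
metadata:
  name: my-consumer-scaler
spec:
  scaleTargetRef:
    name: my-consumer-deployment
  minReplicaCount: 1
  maxReplicaCount: 20   # cap at the topic's partition count
  triggers:
  - type: kafka
    metadata:
      bootstrapServers: kafka:9092
      consumerGroup: my-group
      topic: my-topic
      lagThreshold: "100"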

Fares
