Posted to users@kafka.apache.org by John Yost <ho...@gmail.com> on 2016/02/18 14:18:04 UTC

Does kafka.common.QueueFullException indicate back pressure in Kafka?

Hi Everyone,

I am encountering this exception, similar to Saurabh's report earlier today,
as I try to scale up a Storm -> Kafka output via the KafkaBolt (i.e., by
adding more KafkaBolt executors).

Question: does this necessarily indicate back pressure from Kafka, where
the Kafka writes cannot keep up with the incoming messages sent over by
Storm? If so, do I add brokers to the cluster, add more topics, some
combination thereof, or something else?

As always, any thoughts from people who know more than I do are
appreciated. :)

Thanks

--John

Re: Does kafka.common.QueueFullException indicate back pressure in Kafka?

Posted by Alex Loddengaard <al...@confluent.io>.

Hi John,

I'm glad the info was helpful.

It's hard to diagnose this issue without monitoring. I suggest setting up
Graphite to graph JMX metrics. There's a good script (not designed for
production) here, as part of a Vagrant VM:
https://github.com/gwenshap/ops_training_vm/blob/master/bootstrap.sh

Once monitoring is set up, see where the brokers spend their time servicing
a request. More producers will put more load on the brokers and, depending
on configuration, can cause requests to be ACK'd more slowly, which would
cause the producers to buffer more. Again, this is all very
configuration-specific, so it's hard to give you specific advice.

Alex
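
For readers who want to try the Graphite suggestion above, here is a minimal
Python sketch of how a sample could be pushed to Graphite over its plaintext
protocol (one "path value timestamp" line per sample). The metric path, host,
and port are assumptions for illustration; 2003 is Carbon's default plaintext
port. How the values are read from JMX is out of scope here, and this is not
the approach used by the linked bootstrap script.

```python
import socket
import time


def graphite_line(path, value, timestamp=None):
    """Format one sample as a Graphite plaintext line ending in a newline."""
    if timestamp is None:
        timestamp = int(time.time())
    return f"{path} {value} {int(timestamp)}\n"


def send_samples(samples, host="localhost", port=2003):
    """Send (path, value, timestamp) tuples to a Carbon plaintext listener.

    host and port are assumptions; adjust them to your Graphite install.
    """
    payload = "".join(graphite_line(*s) for s in samples).encode("ascii")
    with socket.create_connection((host, port), timeout=5) as conn:
        conn.sendall(payload)


if __name__ == "__main__":
    # Hypothetical metric path and value; prints the line instead of sending it.
    print(graphite_line("kafka.broker1.produce.total_time_ms", 12.5), end="")
```

Calling send_samples() with a list of such tuples on a timer is enough to get
a first graph; a production setup would more likely use an existing JMX
reporter.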


On Sat, Feb 20, 2016 at 2:59 AM, John Yost <ho...@gmail.com> wrote:

> Hi Alex,
>
> Excellent information, thanks! I very much appreciate your time.  BTW,
> Kafka is an EXCELLENT product.
>
> It seems like my situation may be a bit of an edge case, based upon your
> response. Specifically, when I added more producers (in the case of Storm,
> a Kafka producer is a KafkaBolt), that is when the QueueFullExceptions were
> thrown. In one instance, I went from 30 KafkaBolts (producers) to 60, and,
> after about 45 minutes or so, I started seeing QueueFullExceptions.  This
> is why I am wondering if these exceptions can also be a symptom of back
> pressure from Kafka.
>
> Is this plausible?
>
> --John
>
>
>
> I have tuned the producers

-- 
*Alex Loddengaard | **Solutions Architect | Confluent*
*Download Apache Kafka and Confluent Platform: www.confluent.io/download
<http://www.confluent.io/download>*

Re: Does kafka.common.QueueFullException indicate back pressure in Kafka?

Posted by John Yost <ho...@gmail.com>.

Hi Alex,

Excellent information, thanks! I very much appreciate your time.  BTW,
Kafka is an EXCELLENT product.

It seems like my situation may be a bit of an edge case, based upon your
response. Specifically, when I added more producers (in the case of Storm,
a Kafka producer is a KafkaBolt), that is when the QueueFullExceptions were
thrown. In one instance, I went from 30 KafkaBolts (producers) to 60, and,
after about 45 minutes or so, I started seeing QueueFullExceptions.  This
is why I am wondering if these exceptions can also be a symptom of back
pressure from Kafka.

Is this plausible?

--John



I have tuned the producers
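
To make the question concrete, here is a minimal Python sketch of a bounded
producer-side buffer. It is an analogy only: Python's queue.Queue and
queue.Full stand in for the producer's internal queue and
QueueFullException, and the capacity and rates are made-up numbers, not
measurements from this cluster. It illustrates why a finite buffer can
overflow well after a change is made: if messages arrive only slightly
faster than they are sent and ACK'd, the buffer takes a long time to fill.

```python
import queue


def seconds_until_full(capacity, in_rate, out_rate):
    """Return seconds until a bounded buffer fills, or None if it never does.

    capacity: maximum number of buffered messages
    in_rate:  messages enqueued per second
    out_rate: messages drained (sent and ACK'd) per second
    """
    if in_rate <= out_rate:
        return None  # the drain keeps up, so the buffer never fills
    return capacity / (in_rate - out_rate)


def simulate(capacity, in_per_tick, out_per_tick, max_ticks):
    """Step a bounded queue tick by tick; return the tick at which it overflows."""
    buf = queue.Queue(maxsize=capacity)
    for tick in range(1, max_ticks + 1):
        # Drain first: the sender removes up to out_per_tick messages.
        for _ in range(out_per_tick):
            if buf.empty():
                break
            buf.get_nowait()
        # Then enqueue without blocking, like a producer that refuses to wait.
        for i in range(in_per_tick):
            try:
                buf.put_nowait((tick, i))
            except queue.Full:
                return tick  # analogue of QueueFullException
    return None  # no overflow within max_ticks


if __name__ == "__main__":
    # Hypothetical numbers: 10,000-slot buffer, 1,010 msg/s in, 1,000 msg/s out.
    print(seconds_until_full(10_000, 1_010, 1_000))
    print(simulate(10_000, 1_010, 1_000, 5_000))
```

Under this model, a small shortfall in drain rate (for example, slightly
slower ACKs after doubling the number of producers) is enough to produce an
overflow that only shows up after a long delay.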

On Thu, Feb 18, 2016 at 3:59 PM, Alex Loddengaard <al...@confluent.io> wrote:

> Hi John,
>
> I should preface this by saying I've never used Storm and KafkaBolt and am
> not a streaming expert.
>
> However, if you're running out of buffer in the producer (as is what's
> happening in the other thread you referenced), you can possibly alleviate
> this by adding more producers, or by tuning the producers. Tuning the
> brokers or adding more brokers may help as well, but it's hard to say for
> sure without looking at your monitors and knowing more about the use case
> and cluster.
>
> I suggest setting up monitoring and looking deeply at the JMX metrics that
> are created to understand where each message spends most of its time
> (producer, broker, consumer to start, then request queues, io threads,
> etc). The docs go through each JMX metric relevant here. Then from there
> you can start understanding how to alleviate the problem.
>
> Feel free to share metrics and more information and we can explore them
> together.
>
> Alex

Re: Does kafka.common.QueueFullException indicate back pressure in Kafka?

Posted by Alex Loddengaard <al...@confluent.io>.

Hi John,

I should preface this by saying I've never used Storm and KafkaBolt and am
not a streaming expert.

However, if you're running out of buffer in the producer (as is what's
happening in the other thread you referenced), you can possibly alleviate
this by adding more producers, or by tuning the producers. Tuning the
brokers or adding more brokers may help as well, but it's hard to say for
sure without looking at your monitors and knowing more about the use case
and cluster.

I suggest setting up monitoring and looking deeply at the JMX metrics that
are created to understand where each message spends most of its time
(producer, broker, consumer to start, then request queues, io threads,
etc). The docs go through each JMX metric relevant here. Then from there
you can start understanding how to alleviate the problem.

Feel free to share metrics and more information and we can explore them
together.

Alex
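
As a rough illustration of "where each message spends most of its time" on
the broker side, here is a small Python sketch that takes the per-stage
request timings a broker exposes over JMX and reports which stage dominates.
The stage names follow the broker's RequestMetrics MBeans as documented for
Kafka 0.8.2 and later (exact MBean naming differs in older versions, so treat
the name pattern as an assumption), and the sample values are invented.

```python
# Per-stage timings that together make up a request's total time on the broker.
STAGES = (
    "RequestQueueTimeMs",   # waiting in the request queue for an I/O thread
    "LocalTimeMs",          # processing on the leader (e.g. appending to the log)
    "RemoteTimeMs",         # waiting on followers (relevant when acks require them)
    "ResponseQueueTimeMs",  # waiting in the response queue
    "ResponseSendTimeMs",   # sending the response back to the client
)


def mbean_name(stage, request="Produce"):
    """Build the MBean name for one stage (pattern assumed from Kafka 0.8.2+ docs)."""
    return f"kafka.network:type=RequestMetrics,name={stage},request={request}"


def dominant_stage(times_ms):
    """Return (stage, share_of_total) for the stage with the largest time."""
    total = sum(times_ms[s] for s in STAGES)
    if total <= 0:
        return None, 0.0
    stage = max(STAGES, key=lambda s: times_ms[s])
    return stage, times_ms[stage] / total


if __name__ == "__main__":
    # Invented sample values, in milliseconds.
    sample = {
        "RequestQueueTimeMs": 2,
        "LocalTimeMs": 5,
        "RemoteTimeMs": 40,
        "ResponseQueueTimeMs": 1,
        "ResponseSendTimeMs": 2,
    }
    stage, share = dominant_stage(sample)
    print(mbean_name(stage), f"{share:.0%}")
```

A dominant RequestQueueTimeMs would point toward saturated I/O threads, while
a dominant RemoteTimeMs would point toward replication; either reading should
be checked against the rest of the metrics before acting on it.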
