Posted to users@kafka.apache.org by Rajiv Kurian <ra...@signalfuse.com> on 2014/12/13 05:12:50 UTC

kafka async producer takes a lot of cpu

I am using the Kafka Java API async producer (the one that wraps the Scala
client). It's dropping quite a bit of data because the send queue keeps
filling up, and the process ends up taking a lot of CPU too.
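
For reference, a rough sketch of how the producer is set up - the broker
hosts, topic name, and sizes below are placeholders rather than my real
values, and the queue.* settings are the knobs that control when the async
producer drops instead of blocking:

import java.util.Properties;
import kafka.javaapi.producer.Producer;
import kafka.producer.KeyedMessage;
import kafka.producer.ProducerConfig;

public class AsyncProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        // Placeholder broker list; the real cluster has 3 brokers.
        props.put("metadata.broker.list", "broker1:9092,broker2:9092,broker3:9092");
        props.put("serializer.class", "kafka.serializer.DefaultEncoder"); // raw byte[] payloads
        props.put("producer.type", "async");                // sends go through the in-memory queue
        props.put("queue.buffering.max.messages", "10000"); // queue capacity
        props.put("queue.enqueue.timeout.ms", "0");         // 0 = drop immediately when the queue is full
        props.put("batch.num.messages", "200");             // batch size drained by ProducerSendThread

        Producer<byte[], byte[]> producer =
                new Producer<byte[], byte[]>(new ProducerConfig(props));
        // Placeholder topic and payload.
        producer.send(new KeyedMessage<byte[], byte[]>("my-topic", new byte[64]));
        producer.close();
    }
}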

I am posting to a topic with 1024 partitions (across 3 brokers) - maybe the
high number of partitions is one of the issues. Profiling with YourKit showed
37% of my CPU being spent in kafka.producer.async.ProducerSendThread.run(),
so it seems like the producer is not able to keep up with my application and
starts dropping. When I expand this call tree in YourKit, I see that 23% of
total CPU (not 23% of the 37%) is being spent on logging! Something like this:

kafka.producer.BrokerPartitionInfo$$anonfun$getBrokerPartitionInfo$2.apply(PartitionMetadata)
->kafka.producer.BrokerPartitionInfo.debug(Function0)
   ->kafka.utils.Logging$class.debug(Logging, Function0)
      ->org.apache.log4j.Category.isDebugEnabled()
         ->... (a bunch of other things that finally break down into)
            ->LoggerContext.java:252
               ->ch.qos.logback.classic.spi.TurboFilterList.getTurboFilterChainDecision(Marker, Logger, Level, String, Object[], Throwable)

I am not sure what's going on here. When I look at my process log, none of
these messages are actually logged (probably because of the log level).
Further, I don't see anything very suspicious in the broker logs, and the
brokers are at 60-70% CPU.
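
One cheap thing I am considering is forcing the kafka.* loggers up to WARN at
startup. This assumes logback is the backing implementation (which is what the
TurboFilterList frames suggest, with Kafka's log4j calls bridged through
log4j-over-slf4j), and I am not sure it will remove the overhead seen in the
profile, but it is easy to test:

import ch.qos.logback.classic.Level;
import ch.qos.logback.classic.LoggerContext;
import org.slf4j.LoggerFactory;

public class QuietKafkaLoggers {
    public static void main(String[] args) {
        // Assumes logback is the slf4j binding; the cast fails otherwise.
        LoggerContext context = (LoggerContext) LoggerFactory.getILoggerFactory();
        // Raise the Kafka client loggers so debug() calls are filtered as early as possible.
        context.getLogger("kafka").setLevel(Level.WARN);
        context.getLogger("kafka.producer").setLevel(Level.WARN);
    }
}

The same thing can be done declaratively with a <logger name="kafka"
level="WARN"/> entry in logback.xml.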

I am planning to try the new Java beta producer client, but I am afraid
something deeper is going on here that might not be solved by switching to
the newer client.
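
For what it's worth, the switch I have in mind looks roughly like this. It is
written against the new producer API as documented (the beta's config keys,
particularly the serializer settings, may differ), and the hosts and topic are
placeholders:

import java.util.Properties;
import org.apache.kafka.clients.producer.KafkaProducer;
import org.apache.kafka.clients.producer.ProducerRecord;

public class NewProducerSketch {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("bootstrap.servers", "broker1:9092,broker2:9092,broker3:9092"); // placeholder hosts
        props.put("key.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        props.put("value.serializer", "org.apache.kafka.common.serialization.ByteArraySerializer");
        // buffer.memory bounds how much unsent data is held client-side, which replaces
        // the old async producer's queue.buffering.* behaviour.
        props.put("buffer.memory", "33554432");

        KafkaProducer<byte[], byte[]> producer = new KafkaProducer<byte[], byte[]>(props);
        // Placeholder topic and payload.
        producer.send(new ProducerRecord<byte[], byte[]>("my-topic", new byte[64]));
        producer.close();
    }
}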

Re: kafka async producer takes a lot of cpu

Posted by Neha Narkhede <ne...@confluent.io>.
The stack trace you showed shouldn't be going beyond the isDebugEnabled()
check if your log4j properties file is configured at a higher log level.
Regarding your producer dropping messages, have you checked the produce
metrics ("kafka.network":name="Produce-TotalTimeMs",type="RequestMetrics")
on the broker to see whether the bottleneck is on the server or not?
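
If it helps, something along these lines can read that timer over JMX. The
broker host and JMX port are placeholders, and the attribute names ("Mean",
"99thPercentile") are the ones the Yammer metrics JMX exporter normally
registers, so adjust them if your broker exposes something different:

import javax.management.MBeanServerConnection;
import javax.management.ObjectName;
import javax.management.remote.JMXConnector;
import javax.management.remote.JMXConnectorFactory;
import javax.management.remote.JMXServiceURL;

public class ProduceTimeCheck {
    public static void main(String[] args) throws Exception {
        // Placeholder broker host and JMX port.
        JMXServiceURL url = new JMXServiceURL(
                "service:jmx:rmi:///jndi/rmi://broker1:9999/jmxrmi");
        JMXConnector connector = JMXConnectorFactory.connect(url);
        try {
            MBeanServerConnection mbs = connector.getMBeanServerConnection();
            ObjectName produceTime = new ObjectName(
                    "\"kafka.network\":name=\"Produce-TotalTimeMs\",type=\"RequestMetrics\"");
            System.out.println("mean = " + mbs.getAttribute(produceTime, "Mean"));
            System.out.println("p99  = " + mbs.getAttribute(produceTime, "99thPercentile"));
        } finally {
            connector.close();
        }
    }
}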



-- 
Thanks,
Neha

Re: kafka async producer takes a lot of cpu

Posted by Rajiv Kurian <ra...@signalfuse.com>.
I'll try it and let you guys know. Is there anything that comes to mind
from these log messages though? Why would there be so many log messages?
Would you suggest doing something else to find out why things are working
so poorly? I am worried about making the risky transition to the beta
producer client and finding out that things don't improve at all. Another
peculiar thing was that only one of my producers (out of 3, each on a
separate host) was dropping so many messages.


Re: kafka async producer takes a lot of cpu

Posted by Neha Narkhede <ne...@confluent.io>.
Thanks for reporting the issue, Rajiv. Since we are actively phasing out
the old client, it will be very helpful to know what the behavior on the
new client is.




-- 
Thanks,
Neha