Posted to users@kafka.apache.org by Niklas Ström <st...@gmail.com> on 2016/12/08 15:44:13 UTC

Configuration for low latency and low cpu utilization? java/librdkafka

Use case scenario:
We want fairly low latency, say below 20 ms, and we want to be able to
run a few hundred processes (on one machine), each both producing to and
consuming from a handful of topics. The throughput is not high, let's say
on average 10 messages per second per process. Most messages are 50-500
bytes; some may be a few kilobytes.

How should we adjust the configuration parameters for our use case?

Our experiments so far give us good latency, but at the expense of CPU
utilization. Even when we accept worse latency, the CPU utilization is
still not satisfactory. Since we will have a lot of processes, we are
concerned that short poll loops will cause excessive CPU consumption. We
are hoping we have missed some configuration parameter, or that there is
an issue in our environment that we can find and solve.

We are using both the Java client and librdkafka and see similar CPU
issues in both clients.

We have looked at recommendations from:
https://github.com/edenhill/librdkafka/wiki/How-to-decrease-message-latency
The only thing that seems to really make a difference for librdkafka is
socket.blocking.max.ms, but reducing that also makes the CPU go up.

I would really appreciate input on configuration parameters, and any
experience with environment issues that have caused high CPU load. Or is
our scenario not feasible at all?

Cheers

Re: Configuration for low latency and low cpu utilization? java/librdkafka

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
On the producer side, there's not much you can do to reduce CPU usage if
you want low latency and don't have enough throughput to buffer multiple
messages -- you're going to end up sending 1 record at a time in order to
achieve your desired latency. Note, however, that the producer is thread
safe, so if it is possible to combine multiple processes into a single
multi-threaded app, you might be able to share a single producer and get
better batching.
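
As a rough sketch of that shared-producer idea with the Java client (the
broker address, topic name, and the small linger.ms value are illustrative
assumptions, not something from this thread):

    import java.util.Properties;
    import org.apache.kafka.clients.producer.KafkaProducer;
    import org.apache.kafka.clients.producer.ProducerRecord;
    import org.apache.kafka.common.serialization.StringSerializer;

    public class SharedProducerSketch {
        public static void main(String[] args) throws InterruptedException {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
            props.put("key.serializer", StringSerializer.class.getName());
            props.put("value.serializer", StringSerializer.class.getName());
            // A small linger lets records from concurrent threads end up in the
            // same batch while staying well under a 20 ms latency budget
            // (the value is illustrative).
            props.put("linger.ms", "5");

            // One producer instance shared by every thread in the process.
            // KafkaProducer is thread safe, so no external locking is needed.
            final KafkaProducer<String, String> producer = new KafkaProducer<>(props);

            Runnable worker = () -> {
                for (int i = 0; i < 10; i++) {
                    // "events" is a placeholder topic name.
                    producer.send(new ProducerRecord<>("events",
                            Thread.currentThread().getName(), "payload-" + i));
                }
            };

            Thread t1 = new Thread(worker);
            Thread t2 = new Thread(worker);
            t1.start();
            t2.start();
            t1.join();
            t2.join();

            producer.close();
        }
    }

With a single instance, records produced at roughly the same time by
different threads can share a batch and a request, which is where the CPU
saving would come from.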

On the consumer side, for the Java client fetch.min.bytes is already set
to 1, which will minimize latency -- data will be returned as soon as any
data is available. If you are consistently seeing poll() return no messages
in your consumers, try increasing fetch.max.wait.ms. It defaults to 500ms,
so I'm guessing you're not hitting this, but if your data is spread across
enough partitions and brokers, it's possible you are sending out a bunch of
fetch requests that aren't returning any data.
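
For reference, a minimal Java consumer sketch along those lines (broker
address, group id, topic, and the timeout values are placeholders, and it
assumes a kafka-clients version with poll(Duration)):

    import java.time.Duration;
    import java.util.Collections;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.ConsumerRecord;
    import org.apache.kafka.clients.consumer.ConsumerRecords;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class LowLatencyConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
            props.put("group.id", "low-latency-group");         // placeholder group id
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());
            // fetch.min.bytes=1 is the default: the broker answers as soon as any data exists.
            props.put("fetch.min.bytes", "1");
            // Raised above the 500 ms default so empty fetches wait on the broker
            // instead of returning immediately (value is illustrative).
            props.put("fetch.max.wait.ms", "1000");

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                consumer.subscribe(Collections.singletonList("events"));  // placeholder topic
                while (true) {
                    // A non-trivial poll timeout avoids spinning when nothing arrives.
                    ConsumerRecords<String, String> records = consumer.poll(Duration.ofMillis(500));
                    for (ConsumerRecord<String, String> record : records) {
                        System.out.printf("offset=%d key=%s value=%s%n",
                                record.offset(), record.key(), record.value());
                    }
                }
            }
        }
    }

Because fetch.min.bytes is 1, a larger fetch.max.wait.ms should only delay
the empty responses, not the ones that actually carry data.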

Also, as with producers, if you have light enough traffic you will benefit
from consolidating to fewer consumers where possible. A single fetch request
covers *all* partitions the consumer is reading that have the same leader,
which means you amortize the cost of each request over multiple topic
partitions (while maintaining the low latency guarantees when traffic in
all the partitions is light anyway).
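
A sketch of that consolidation, again with placeholder names: one consumer
subscribed to several topics, so partitions that share a leader are covered
by the same fetch request:

    import java.util.Arrays;
    import java.util.Properties;
    import org.apache.kafka.clients.consumer.KafkaConsumer;
    import org.apache.kafka.common.serialization.StringDeserializer;

    public class ConsolidatedConsumerSketch {
        public static void main(String[] args) {
            Properties props = new Properties();
            props.put("bootstrap.servers", "localhost:9092");   // placeholder broker address
            props.put("group.id", "consolidated-group");        // placeholder group id
            props.put("key.deserializer", StringDeserializer.class.getName());
            props.put("value.deserializer", StringDeserializer.class.getName());

            try (KafkaConsumer<String, String> consumer = new KafkaConsumer<>(props)) {
                // One consumer reading several topics (names are placeholders):
                // partitions led by the same broker are fetched in a single request,
                // so the per-request overhead is shared instead of being paid once
                // per consumer process.
                consumer.subscribe(Arrays.asList("orders", "metrics", "audit"));
                // ... poll loop as in the previous sketch ...
            }
        }
    }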

Finally, as always, your best bet is to measure metrics & profile your app
to see where the CPU time is going.
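
If it helps, the Java clients expose their metrics programmatically. A small
sketch that dumps fetch- and io-wait-related metrics from a consumer (the
name filter is just one arbitrary choice, and it assumes a client version
that has Metric.metricValue()):

    import org.apache.kafka.clients.consumer.KafkaConsumer;

    public class ClientMetricsDump {
        // Print a subset of client-side metrics for an existing consumer instance.
        // Fetch rates and io-wait ratios give a hint whether CPU time is spent on
        // real work or on empty fetch/poll cycles.
        public static void dump(KafkaConsumer<?, ?> consumer) {
            consumer.metrics().forEach((name, metric) -> {
                if (name.name().contains("fetch") || name.name().contains("io-wait")) {
                    System.out.println(name.group() + "/" + name.name()
                            + " = " + metric.metricValue());
                }
            });
        }
    }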

-Ewen

-- 
Thanks,
Ewen