Posted to users@kafka.apache.org by Kris K <sq...@gmail.com> on 2015/06/23 20:18:14 UTC

high level consumer memory footprint

Hi,

I was just wondering if there is any difference in the memory footprint of
a high level consumer when:

1. the consumer is live and continuously consuming messages with no backlog
2. the consumer has been down for quite some time and is brought back up to
clear the backlog.

My test case with Kafka 0.8.2.1, using only one topic, is as follows:

Setup: 6 brokers and 3 zookeeper nodes
Message Size: 1 MB
Producer load: 100 threads with 1000 messages per thread (100k messages total)
No. of partitions in topic: 100
Consumer threads: 100, all in the same consumer group
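
For reference, the consumer side follows the stock 0.8 high level consumer
pattern, roughly like the sketch below (the topic name, ZooKeeper connect
string, and group id here are placeholders, not my actual values):

import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.Properties;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;

import kafka.consumer.Consumer;
import kafka.consumer.ConsumerConfig;
import kafka.consumer.ConsumerIterator;
import kafka.consumer.KafkaStream;
import kafka.javaapi.consumer.ConsumerConnector;

public class BacklogConsumer {
    public static void main(String[] args) {
        Properties props = new Properties();
        props.put("zookeeper.connect", "zk1:2181,zk2:2181,zk3:2181"); // placeholder
        props.put("group.id", "test-group");                          // placeholder

        ConsumerConnector connector =
            Consumer.createJavaConsumerConnector(new ConsumerConfig(props));

        // Request 100 streams for the topic, one per consumer thread.
        Map<String, Integer> topicCountMap = new HashMap<String, Integer>();
        topicCountMap.put("test-topic", 100);                         // placeholder topic
        List<KafkaStream<byte[], byte[]>> streams =
            connector.createMessageStreams(topicCountMap).get("test-topic");

        ExecutorService executor = Executors.newFixedThreadPool(100);
        for (final KafkaStream<byte[], byte[]> stream : streams) {
            executor.submit(new Runnable() {
                public void run() {
                    ConsumerIterator<byte[], byte[]> it = stream.iterator();
                    while (it.hasNext()) {
                        byte[] payload = it.next().message(); // ~1 MB per message
                        // process payload ...
                    }
                }
            });
        }
    }
}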

I initially started the producer and the consumer in the same Java process
with a 1 GB heap. The producer was able to send all the messages to the
brokers, but the consumer started throwing OutOfMemoryError after consuming
about 26k messages.

After restarting the process with a 5 GB heap, the consumer consumed around
4.8k messages before going OOM again (while clearing a backlog of around 74k).
The rest of the messages were consumed only after I bumped the heap up to 10 GB.

On the consumer, I have the default values for fetch.message.max.bytes and
queued.max.message.chunks.

If the estimate
(fetch.message.max.bytes) * (queued.max.message.chunks) * (no. of consumer
threads) holds for the high level consumer, then 1024*1024 * 10 * 100 (close
to 1 GB) is well below the 5 GB heap I allocated. Did I leave something out
of this calculation?
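
Writing that estimate out with the numbers I was assuming (the 1 MB fetch
size and 10 queued chunks are just the values I plugged in above; treat it as
a back-of-the-envelope figure, not a hard bound):

  fetch.message.max.bytes * queued.max.message.chunks * consumer threads
    = 1,048,576 bytes * 10 * 100
    = 1,048,576,000 bytes
    ≈ 1 GB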


Regards,
Kris

Re: high level consumer memory footprint

Posted by Kris K <sq...@gmail.com>.
Hi,

I found that the consumer config param fetch.message.max.bytes is actually
set to 100 MB on the consumer (not the default I assumed in my first mail),
and I think this is what caused the problem.
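
Plugging that 100 MB value into the same rough formula from my first mail,
and even assuming only a single queued chunk per stream (which is probably
optimistic), the numbers start to line up with what I saw:

  fetch.message.max.bytes * queued chunks * consumer threads
    = 104,857,600 bytes * 1 * 100
    = 10,485,760,000 bytes
    ≈ 10 GB

which is roughly the heap it finally took to clear the backlog.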

It would really be helpful if anyone could explain how much memory the
consumer (running 100 threads) is going to need to consume a backlog of
100k 1 MB messages from 100 partitions.
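
For anyone else who hits this, the property in question is just a consumer
config entry. A minimal sketch of dialing it back down closer to the real max
message size would look something like the snippet below (the 2 MB value is
only an illustration, and as far as I understand fetch.message.max.bytes still
needs to be at least as large as the biggest message the broker will accept):

import java.util.Properties;
import kafka.consumer.ConsumerConfig;

Properties props = new Properties();
props.put("zookeeper.connect", "zk1:2181,zk2:2181,zk3:2181"); // placeholder
props.put("group.id", "test-group");                          // placeholder
// Messages here are ~1 MB, so 2 MB leaves some headroom for overhead
// (illustrative value, not what is currently deployed).
props.put("fetch.message.max.bytes", String.valueOf(2 * 1024 * 1024));
ConsumerConfig config = new ConsumerConfig(props);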

Thanks,
Kris

On Tue, Jun 23, 2015 at 11:18 AM, Kris K <sq...@gmail.com> wrote:

> Hi,
>
> I was just wondering if there is any difference in the memory footprint of
> a high level consumer when:
>
> 1. the consumer is live and continuously consuming messages with no backlog
> 2. the consumer has been down for quite some time and is brought back up to
> clear the backlog.
>
> My test case with Kafka 0.8.2.1, using only one topic, is as follows:
>
> Setup: 6 brokers and 3 zookeeper nodes
> Message Size: 1 MB
> Producer load: 100 threads with 1000 messages per thread (100k messages total)
> No. of partitions in topic: 100
> Consumer threads: 100, all in the same consumer group
>
> I initially started the producer and the consumer in the same Java process
> with a 1 GB heap. The producer was able to send all the messages to the
> brokers, but the consumer started throwing OutOfMemoryError after consuming
> about 26k messages.
>
> After restarting the process with a 5 GB heap, the consumer consumed around
> 4.8k messages before going OOM again (while clearing a backlog of around
> 74k). The rest of the messages were consumed only after I bumped the heap up
> to 10 GB.
>
> On the consumer, I have the default values for fetch.message.max.bytes and
> queued.max.message.chunks.
>
> If the estimate
> (fetch.message.max.bytes) * (queued.max.message.chunks) * (no. of consumer
> threads) holds for the high level consumer, then 1024*1024 * 10 * 100 (close
> to 1 GB) is well below the 5 GB heap I allocated. Did I leave something out
> of this calculation?
>
>
> Regards,
> Kris
>
>
