Posted to users@kafka.apache.org by Felipe Santos <fe...@gmail.com> on 2016/11/22 12:00:01 UTC

Oversized Message 40k

I read in the documentation that Kafka is not optimized for big messages.
What is considered a big message?

Our messages will average 20k to 40k. Is this a real problem?

Thanks
-- 
Felipe Santos

Re: Oversized Message 40k

Posted by Felipe Santos <fe...@gmail.com>.
Thanks, everyone, for the information. I will run some performance tests.

On Wed, Nov 23, 2016 at 05:14, Ignacio Solis <is...@igso.net> wrote:

> At LinkedIn we have a number of use cases for large messages.  We stick to
> the 1MB message limit at the high end though.
>
> Nacho
>

Re: Oversized Message 40k

Posted by Ignacio Solis <is...@igso.net>.
At LinkedIn we have a number of use cases for large messages.  We stick to
the 1MB message limit at the high end though.

Nacho

-- 
Nacho - Ignacio Solis - isolis@igso.net

RE: Oversized Message 40k

Posted by "Tauzell, Dave" <Da...@surescripts.com>.
> If you have performance numbers you can share for the large messages, I think we'd all appreciate it :)

Here is a sample test run:

3 brokers: 256G memory, 32 cores
3 ZooKeeper nodes on smaller VMs
15 topics, replication: 3, partitions: 3
Two JMeter instances sent "blobs" of random XML spread across the 15 topics.
The first sent messages of up to 1MB with snappy compression, in this ratio:
	Size	Ratio
	3K	95.6
	8K	2
	50K	1
	200K	0.07
	1M	0.04

The second sent 10/25 MB messages at a rate of 3 tps, also with snappy compression.

I used 2 kafka-console-consumer instances to read the data.

I set up JMeter to start with 500 threads and add 100 every 2 minutes, up to 2000 threads. The throughput peaked at about 13,400 at 1400 threads.

The consumers, which ran on the ZooKeeper servers, were able to keep up.

-Dave
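
For a rough sense of how small the typical message in the first JMeter mix was, the ratio table above can be turned into a weighted mean. This is only a sketch: it assumes the ratios are percentages (they do not add up to exactly 100, so the weights are normalized) and that "1m" means 1024 KB. The names MESSAGE_MIX and mean_message_kb are made up for this illustration.

```python
# (size in KB, ratio) pairs copied from the test description above.
# Assumptions: the ratios are percentages, and "1m" means 1024 KB.
MESSAGE_MIX = [(3, 95.6), (8, 2.0), (50, 1.0), (200, 0.07), (1024, 0.04)]


def mean_message_kb(mix):
    """Weighted mean size in KB, normalizing by the sum of the weights."""
    total_weight = sum(weight for _, weight in mix)
    return sum(size * weight for size, weight in mix) / total_weight


if __name__ == "__main__":
    print(f"weights sum to {sum(w for _, w in MESSAGE_MIX):.2f}")
    print(f"mean message size: {mean_message_kb(MESSAGE_MIX):.2f} KB")
```

Under those assumptions the mean comes out at roughly 4 KB, so most of the traffic in that first mix was well below the 20k to 40k range asked about in the original question.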

Re: Oversized Message 40k

Posted by Gwen Shapira <gw...@confluent.io>.
This has been our experience as well. I think the largest we've seen
in production is 50MB.

If you have performance numbers you can share for the large messages,
I think we'd all appreciate it :)

-- 
Gwen Shapira
Product Manager | Confluent
650.450.2760 | @gwenshap
Follow us: Twitter | blog

RE: Oversized Message 40k

Posted by "Tauzell, Dave" <Da...@surescripts.com>.
I ran tests with a mix of messages, some as large as 20MB. These large messages do slow down processing, but it still works.

-Dave

Re: Oversized Message 40k

Posted by ha...@confluent.io.
The default config handles messages up to 1MB so you should be fine.

-hans
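
As a rough illustration of the point about the default config, the sketch below compares a message size against the size-related settings that usually matter. The setting names are real Kafka options, but the default values listed are assumptions for releases from around this time and should be checked against the documentation for the version in use. The helper settings_to_raise is hypothetical, and the check ignores record overhead and batching.

```python
# Assumed defaults, in bytes, for Kafka releases of roughly this period.
# Verify them against the documentation for your version.
ASSUMED_DEFAULT_LIMITS = {
    "message.max.bytes": 1_000_012,          # broker: largest message accepted
    "replica.fetch.max.bytes": 1_048_576,    # broker: replication fetch size
    "max.request.size": 1_048_576,           # producer: largest request sent
    "max.partition.fetch.bytes": 1_048_576,  # consumer: per-partition fetch size
}


def settings_to_raise(message_bytes, limits=ASSUMED_DEFAULT_LIMITS):
    """Return the settings whose limit is smaller than the message size."""
    return sorted(name for name, limit in limits.items() if limit < message_bytes)


if __name__ == "__main__":
    for label, size in [("40k", 40 * 1024), ("20MB", 20 * 1024 * 1024)]:
        too_small = settings_to_raise(size)
        print(label, "fits the defaults" if not too_small else f"needs: {too_small}")
```

A 40k message sits far below every assumed limit, while something like the 20MB messages mentioned elsewhere in this thread would need all four settings raised.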

Re: Oversized Message 40k

Posted by Dominik Safaric <do...@gmail.com>.
Big is a relative term.

And the question you ask is quite difficult to answer because no other information is available, including the configuration of the Kafka cluster, the hardware specification, and so on.

I suggest the following: (1) read a couple of benchmarks such as [1], and (2) investigate this question yourself by designing and running a microbenchmark, since you’re the only one who knows what throughput or latency, for example, you’d like to achieve.

In general, though, I do believe those messages are larger than expected. Hence, you may also want to take a look at compression in Kafka, although it will cost you some performance.

[1] https://engineering.linkedin.com/kafka/benchmarking-apache-kafka-2-million-writes-second-three-cheap-machines
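
A microbenchmark of the compression trade-off could start from something like the sketch below. It is not a Kafka benchmark: it only measures, in plain Python, how much a 20 to 40 KiB XML-like payload shrinks under gzip and how long each compression call takes. gzip is used because it is in the Python standard library and is one of the codecs Kafka supports; the payload generator and the function names are assumptions made for this example, and real messages may compress very differently.

```python
import gzip
import random
import time


def make_xml_payload(size_bytes, seed=0):
    """Build a deterministic, XML-like payload of exactly size_bytes bytes."""
    rng = random.Random(seed)
    parts = []
    total = 0
    while total < size_bytes:
        item = f"<item id=\"{rng.randrange(10**6)}\">{rng.random():.6f}</item>"
        parts.append(item)
        total += len(item)
    # The text is pure ASCII, so character count equals byte count.
    return "".join(parts).encode("ascii")[:size_bytes]


def measure_gzip(payload, rounds=50):
    """Return (compressed/original size ratio, mean seconds per compress call)."""
    start = time.perf_counter()
    for _ in range(rounds):
        compressed = gzip.compress(payload)
    elapsed = time.perf_counter() - start
    return len(compressed) / len(payload), elapsed / rounds


if __name__ == "__main__":
    for kib in (20, 40):
        payload = make_xml_payload(kib * 1024)
        ratio, seconds = measure_gzip(payload)
        print(f"{kib} KiB: ratio {ratio:.2f}, {seconds * 1e6:.0f} us per message")
```

The numbers it prints depend on the machine and on the payload, so they are only a starting point for an end-to-end test against a real cluster with real messages.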
