Posted to users@kafka.apache.org by Dhirendra Singh <dh...@gmail.com> on 2020/12/23 05:11:16 UTC

Producer batch size

Hi,
I have a question about the batch.size producer configuration.
What happens when batch.size has been reached and the producer app thread
sends more data?
Does the thread block until space becomes available in the buffer
containing the batch?

Thanks,
Dhirendra.

Re: Producer batch size

Posted by Dhirendra Singh <dh...@gmail.com>.
Thanks, Steve!
And yes, by buffer.size I mean batch.size. Sorry for the typo.
Let me restate my question.
Let's assume the producer I/O thread responsible for sending messages to
the brokers is slow and the app thread calling the send method is very fast.
While the producer I/O thread is busy sending messages to the brokers, the
app thread, which is writing messages to the buffer, has written messages
up to batch.size.
What happens in this scenario?
Does the app thread continue to write to the same batch, or does it create
a new batch?
If it continues to write to the same batch, does the I/O thread, when
available, send all the messages in the batch or only the messages up to
batch.size?
If it creates a new batch, does the I/O thread, when available, send all
the batches or one batch at a time?

Thanks,
Dhirendra.


On Thu, Dec 24, 2020 at 9:28 PM Steve Howard <st...@confluent.io>
wrote:

> If by buffer.size you mean batch.size, then no, it is still very relevant.
> The buffer.memory space is used to ensure the application can still produce
> messages for a period of time until the producer can keep up with the
> application.  The total time the producer has available to catch up is the
> time it takes to fill buffer.memory (32MB by default) plus max.block.ms.
>
> batch.size controls (to some extent) how often the producer
> thread publishes messages to Kafka.  At a default of only 16KB, it is a very
> small fraction of the size of the local buffer (buffer.memory) used to
> store messages prior to transmission.  This setting, however, can
> greatly affect throughput as well as latency.  If you increase batch.size
> to 1MB from the default of 16KB, you will see far fewer round trips from the
> producer to Kafka.  This can often greatly increase the throughput of your
> application.  Conversely, if you set it to 0, you effectively disable
> batching, but may see improvements in latency.  The differences can be very
> large and noticeable to the application.

Re: Producer batch size

Posted by Steve Howard <st...@confluent.io>.
If by buffer.size you mean batch.size, then no, it is still very relevant.
The buffer.memory space is used to ensure the application can still produce
messages for a period of time until the producer can keep up with the
application.  The total time the producer has available to catch up is the
time it takes to fill buffer.memory (32MB by default) plus max.block.ms.
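
As a rough illustration of that sum, the following sketch estimates the
catch-up window from a produce rate and a send rate. The class name, method
name, and the two rates are hypothetical, chosen only for the example; the
32MB and 60000 ms figures are the defaults mentioned in this thread. It
assumes constant rates and ignores per-record overhead and compression.

```java
final class CatchUpWindow {
    // Rough estimate: seconds to fill the buffer at the surplus rate,
    // plus the max.block.ms wait that follows once it is full.
    static double secondsBeforeTimeout(long bufferBytes,
                                       double produceBytesPerSec,
                                       double sendBytesPerSec,
                                       long maxBlockMs) {
        double surplus = produceBytesPerSec - sendBytesPerSec;
        if (surplus <= 0) {
            // The sender keeps up, so the buffer never fills.
            return Double.POSITIVE_INFINITY;
        }
        return bufferBytes / surplus + maxBlockMs / 1000.0;
    }

    public static void main(String[] args) {
        long bufferMemory = 32L * 1024 * 1024; // buffer.memory default
        long maxBlockMs = 60_000L;             // max.block.ms default
        double produceRate = 4.0 * 1024 * 1024; // hypothetical: app writes 4 MiB/s
        double sendRate = 2.0 * 1024 * 1024;    // hypothetical: sender drains 2 MiB/s
        // 32 MiB / 2 MiB/s = 16 s to fill, plus 60 s of blocking.
        System.out.println(
            secondsBeforeTimeout(bufferMemory, produceRate, sendRate, maxBlockMs)); // 76.0
    }
}
```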

batch.size controls (to some extent) how often the producer
thread publishes messages to Kafka.  At a default of only 16KB, it is a very
small fraction of the size of the local buffer (buffer.memory) used to
store messages prior to transmission.  This setting, however, can
greatly affect throughput as well as latency.  If you increase batch.size
to 1MB from the default of 16KB, you will see far fewer round trips from the
producer to Kafka.  This can often greatly increase the throughput of your
application.  Conversely, if you set it to 0, you effectively disable
batching, but may see improvements in latency.  The differences can be very
large and noticeable to the application.
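
To put numbers on the size difference, here is a small sketch that counts how
many full batches it takes to hold one buffer.memory worth of data at each
batch.size. The class and method names are made up for the example. Treating
the batch count as a stand-in for round trips is a simplifying assumption
(single partition, every batch full, no compression), not a description of
how the producer actually groups requests.

```java
final class BatchMath {
    // Ceiling division: number of batches needed to hold totalBytes.
    static long batchesNeeded(long totalBytes, long batchSizeBytes) {
        return (totalBytes + batchSizeBytes - 1) / batchSizeBytes;
    }

    public static void main(String[] args) {
        long bufferMemory = 32L * 1024 * 1024; // buffer.memory default (33554432 bytes)
        long defaultBatch = 16L * 1024;        // batch.size default (16384 bytes)
        long largeBatch = 1024L * 1024;        // the 1MB value discussed above

        System.out.println(batchesNeeded(bufferMemory, defaultBatch)); // 2048
        System.out.println(batchesNeeded(bufferMemory, largeBatch));   // 32
    }
}
```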


On Wed, Dec 23, 2020 at 11:48 PM Dhirendra Singh <dh...@gmail.com>
wrote:

> Thanks, Steve!
> So if I understand correctly, the number of messages buffered can be
> greater than batch.size, up to buffer.memory, if the app is sending data
> faster than the producer I/O thread can send it to the broker.
> In this situation buffer.size becomes irrelevant, no?
>
> Thanks,
> Dhirendra.

Re: Producer batch size

Posted by Dhirendra Singh <dh...@gmail.com>.
Thanks, Steve!
So if I understand correctly, the number of messages buffered can be
greater than batch.size, up to buffer.memory, if the app is sending data
faster than the producer I/O thread can send it to the broker.
In this situation buffer.size becomes irrelevant, no?

Thanks,
Dhirendra.

On Wed, Dec 23, 2020 at 11:36 PM Steve Howard <st...@confluent.io>
wrote:

> Hi Dhirendra,
>
> As long as buffer.memory (default 32MB) has space, the producer will
> continue to write to it.  If that space is exhausted, send() blocks, and
> the producer will eventually throw...
>
> org.apache.kafka.common.errors.TimeoutException: Failed to allocate memory
> within the configured max blocking time 60000 ms
>
> The 60 seconds the producer is given to free up room in buffer.memory by
> successfully sending messages is controlled by max.block.ms.
>
> Thanks,
>
> Steve

Re: Producer batch size

Posted by Steve Howard <st...@confluent.io>.
Hi Dhirendra,

As long as buffer.memory (default 32MB) has space, the producer will
continue to write to it.  If that space is exhausted, send() blocks, and
the producer will eventually throw...

org.apache.kafka.common.errors.TimeoutException: Failed to allocate memory
within the configured max blocking time 60000 ms

The 60 seconds the producer is given to free up room in buffer.memory by
successfully sending messages is controlled by max.block.ms.
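
The blocking behaviour can be pictured with a toy model. This is only a
sketch using java.util.concurrent.Semaphore, not the Kafka client's own
buffer management; ToyBufferPool and its methods are hypothetical names, and
it throws java.util.concurrent.TimeoutException rather than the Kafka
exception class shown above. The permits stand in for free bytes of
buffer.memory, and the timeout stands in for max.block.ms.

```java
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

final class ToyBufferPool {
    private final Semaphore freeBytes; // one permit per free byte
    private final long maxBlockMs;     // stand-in for max.block.ms

    ToyBufferPool(int bufferMemoryBytes, long maxBlockMs) {
        this.freeBytes = new Semaphore(bufferMemoryBytes);
        this.maxBlockMs = maxBlockMs;
    }

    // App thread: wait up to maxBlockMs for space, then give up.
    void allocate(int bytes) throws InterruptedException, TimeoutException {
        if (!freeBytes.tryAcquire(bytes, maxBlockMs, TimeUnit.MILLISECONDS)) {
            throw new TimeoutException("Failed to allocate memory within the "
                + "configured max blocking time " + maxBlockMs + " ms");
        }
    }

    // Sender side: a batch was sent, so its bytes become free again.
    void release(int bytes) {
        freeBytes.release(bytes);
    }

    int available() {
        return freeBytes.availablePermits();
    }

    public static void main(String[] args) throws Exception {
        // Tiny values so the example finishes quickly.
        ToyBufferPool pool = new ToyBufferPool(64, 50);
        pool.allocate(48);          // fits: 16 bytes remain free
        try {
            pool.allocate(32);      // does not fit: blocks 50 ms, then throws
        } catch (TimeoutException e) {
            System.out.println("timed out: " + e.getMessage());
        }
        pool.release(48);           // the "sender" frees the first allocation
        pool.allocate(32);          // now succeeds immediately
        System.out.println("free bytes: " + pool.available());
    }
}
```

In the model, as in the explanation above, the caller is never rejected
outright when the buffer is full; it waits, and only fails if nothing is
released before the timeout.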

Thanks,

Steve

On Wed, Dec 23, 2020 at 12:11 AM Dhirendra Singh <dh...@gmail.com>
wrote:

> Hi,
> I have a question about the batch.size producer configuration.
> What happens when batch.size has been reached and the producer app thread
> sends more data?
> Does the thread block until space becomes available in the buffer
> containing the batch?
>
> Thanks,
> Dhirendra.
>