Posted to users@kafka.apache.org by Mads Tandrup <Ma...@schneider-electric.com> on 2018/05/04 11:19:06 UTC
What is the performance impact of setting max.poll.records=1
Hi
What is the performance impact of setting `max.poll.records=1` as opposed to the default of 500?
I have a Java application which processes records one at a time. The processing time varies between messages, so we sometimes exceed `max.poll.interval.ms`.
While I could increase `max.poll.interval.ms`, it would prevent me from detecting a livelock in the application quickly.
There's no benefit to batching the records, so I'm considering setting `max.poll.records=1`. We can define a sensible upper limit for the processing time of a single record.
I've tried to look at the code, and it seems that it fetches up to `fetch.max.bytes`, keeps the data in memory, and returns records from the fetched data when `poll()` is called.
So what is the performance impact of a low `max.poll.records`?
Best regards,
Mads
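For reference, the setup described above could be expressed roughly as follows. This is only a sketch: the class name, the `build` helper, and the 30-second bound are hypothetical, and the deserializer settings a real consumer needs are left out. Only the property keys are actual consumer configuration names.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys are real consumer settings.
class SingleRecordConsumerConfig {
    // Assumed upper bound for processing a single record, in milliseconds.
    static final long MAX_RECORD_PROCESSING_MS = 30_000L;

    static Properties build(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Return at most one record from each poll() call.
        props.setProperty("max.poll.records", "1");
        // With one record per poll, the interval only has to cover one record,
        // so a stuck consumer is noticed after roughly this long.
        props.setProperty("max.poll.interval.ms",
                Long.toString(MAX_RECORD_PROCESSING_MS));
        return props;
    }
}
```

The resulting `Properties` would be passed to the consumer constructor together with key and value deserializers.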
Re: What is the performance impact of setting max.poll.records=1
Posted by "Matthias J. Sax" <ma...@confluent.io>.
`max.poll.records` only configures how many records are returned from
`poll()`. Internally, the consumer buffers a batch of records, and only if
this buffer is empty will it do a new fetch request within `poll()`.
-Matthias
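The buffering described above can be illustrated with a toy model. This is not the Kafka client's code: `BufferedPollModel`, its fields, and the fixed number of records per fetch are all made up for illustration, and it ignores multiple partitions and any prefetching the real consumer may do. It only shows that the number of fetch requests depends on how much one fetch returns, not on how many records each `poll()` hands back.

```java
import java.util.ArrayDeque;
import java.util.ArrayList;
import java.util.Deque;
import java.util.List;

// Toy model only: a consumer that refills an in-memory buffer when it is empty.
class BufferedPollModel {
    private final Deque<Integer> buffer = new ArrayDeque<>();
    private final int logEnd;           // number of records available on the "broker"
    private final int recordsPerFetch;  // stand-in for what one fetch response holds
    private final int maxPollRecords;   // stand-in for max.poll.records
    private int nextOffset = 0;
    int fetchRequests = 0;

    BufferedPollModel(int logEnd, int recordsPerFetch, int maxPollRecords) {
        this.logEnd = logEnd;
        this.recordsPerFetch = recordsPerFetch;
        this.maxPollRecords = maxPollRecords;
    }

    List<Integer> poll() {
        // Only go to the "broker" when nothing is buffered locally.
        if (buffer.isEmpty() && nextOffset < logEnd) {
            fetchRequests++;
            int end = Math.min(nextOffset + recordsPerFetch, logEnd);
            while (nextOffset < end) {
                buffer.add(nextOffset++);
            }
        }
        // Hand back at most maxPollRecords from the buffer.
        List<Integer> out = new ArrayList<>();
        while (out.size() < maxPollRecords && !buffer.isEmpty()) {
            out.add(buffer.poll());
        }
        return out;
    }

    // Consumes the whole log and returns {number of polls, number of fetch requests}.
    static int[] run(int logEnd, int recordsPerFetch, int maxPollRecords) {
        BufferedPollModel model = new BufferedPollModel(logEnd, recordsPerFetch, maxPollRecords);
        int polls = 0;
        int consumed = 0;
        while (consumed < logEnd) {
            consumed += model.poll().size();
            polls++;
        }
        return new int[] {polls, model.fetchRequests};
    }

    public static void main(String[] args) {
        int[] one = run(1000, 100, 1);
        int[] many = run(1000, 100, 500);
        System.out.println("max.poll.records=1:   polls=" + one[0] + " fetches=" + one[1]);
        System.out.println("max.poll.records=500: polls=" + many[0] + " fetches=" + many[1]);
    }
}
```

In this model, with 1000 records and 100 records per fetch, both settings issue 10 fetch requests; the only difference is 1000 calls to `poll()` versus 10.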
On 5/10/18 10:46 PM, Mads Tandrup wrote:
> Hi
>
> I forgot to mention that I have multiple partitions and multiple consumer processes.
> But we can't process the messages in the same partition in parallel, since they might influence the processing of later records.
>
> Does max.poll.records=1 always go to the remote server each time? What if I increase fetch.min.bytes to, say, the expected size of 10 records? What will happen then?
>
> Best regards,
> Mads
Re: What is the performance impact of setting max.poll.records=1
Posted by Mads Tandrup <Ma...@schneider-electric.com>.
Hi
I forgot to mention that I have multiple partitions and multiple consumer processes.
But we can't process the messages in the same partition in parallel, since they might influence the processing of later records.
Does max.poll.records=1 always go to the remote server each time? What if I increase fetch.min.bytes to, say, the expected size of 10 records? What will happen then?
Best regards,
Mads
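As a sketch of the `fetch.min.bytes` idea raised above: as documented, the broker holds a fetch response until at least `fetch.min.bytes` of data is available or `fetch.max.wait.ms` has elapsed, whichever comes first. The helper name, its parameters, and the example sizes below are hypothetical; only the property keys are real consumer settings.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys are real consumer settings.
class FetchTuningConfig {
    static Properties withFetchBatching(int expectedRecordBytes,
                                        int recordsPerFetch,
                                        int maxWaitMs) {
        Properties props = new Properties();
        // Still hand records to the application one at a time.
        props.setProperty("max.poll.records", "1");
        // Ask the broker to wait for roughly recordsPerFetch records' worth of data...
        int minBytes = Math.multiplyExact(expectedRecordBytes, recordsPerFetch);
        props.setProperty("fetch.min.bytes", Integer.toString(minBytes));
        // ...but for no longer than this before answering with whatever it has.
        props.setProperty("fetch.max.wait.ms", Integer.toString(maxWaitMs));
        return props;
    }
}
```

For example, `withFetchBatching(1024, 10, 500)` sets `fetch.min.bytes` to 10240. On a quiet topic this means a record may wait up to `fetch.max.wait.ms` before it is delivered, so the setting trades latency for fewer fetch requests.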
On 07/05/2018 06.36, "R Krishna" <kr...@gmail.com> wrote:
You can always add more partitions/consumer threads, each fetching a few
more records than 1 but manually committing asynchronously one at a time.
It's not the best, but better than setting max.poll.records=1, which fetches
one record from the remote server at a time.
Re: What is the performance impact of setting max.poll.records=1
Posted by R Krishna <kr...@gmail.com>.
You can always add more partitions/consumer threads, each fetching a few
more records than 1 but manually committing asynchronously one at a time.
It's not the best, but better than setting max.poll.records=1, which fetches
one record from the remote server at a time.
--
Radha Krishna, Proddaturi
253-234-5657