Posted to users@kafka.apache.org by Ken Chen <zl...@gmail.com> on 2018/06/01 17:10:16 UTC

Re: Frequent consumer rebalances, auto commit failures

1. Do you have any detailed logs?
2. How do you process the records after polling them?
3. How long does each round of poll() take?

Thanks !


On May 28, 2018, at 10:44 PM, Shantanu Deshmukh <sh...@gmail.com> wrote:

Can anyone here help me please? I am at my wit's end. I now have
max.poll.records set to just 2. I am still getting the "Auto offset commit
failed" warning, and the log file is filling up because of it. Session
timeout is 5 minutes, max.poll.interval.ms is 10 minutes.

On Wed, May 23, 2018 at 12:42 PM Shantanu Deshmukh <sh...@gmail.com>
wrote:

> 
> Hello,
> 
> We have a 3 broker Kafka 0.10.0.1 cluster. There we have 3 topics with 10
> partitions each. We have an application which spawns threads as consumers.
> We spawn 5 consumers for each topic. I am observing that the consumer group
> randomly keeps rebalancing. Then many times we see logs saying "Revoking
> partitions for". This happens almost every 10 minutes. Consumption during
> this time completely stops.
> 
> I have applied this configuration
> max.poll.records 20
> heartbeat.interval.ms 10000
> session.timeout.ms 6000
> 
> Still this did not help. Strange thing is I observed consumer writing logs
> saying "auto commit failed because poll() loop spent too much time
> processing records" even when there was no data in partition to process. We
> have polling interval of 500 ms, specified as argument in poll(). Initially
> I had set same consumer group for all three topics' consumers. Then I
> specified different CGs for different topics' consumers. Even this is not
> helping.
> 
> I am trying to search over the web, checked my code, tried many
> combinations of configuration but still no luck. Please help me.
> 
> Thanks & Regards,
> 
> Shantanu Deshmukh
> 
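A note on the configuration quoted above: the Kafka consumer documentation says heartbeat.interval.ms must be set lower than session.timeout.ms, and typically no higher than one third of it. The posted values (10000 and 6000) do not satisfy that. The sketch below only checks that relationship; the function name check_consumer_timeouts and the dictionary layout are hypothetical, not part of any Kafka client.

```python
# Hypothetical helper, for illustration only: checks the documented
# relationship between heartbeat.interval.ms and session.timeout.ms.

def check_consumer_timeouts(config):
    """Return a list of human-readable problems found in `config`."""
    problems = []
    heartbeat = int(config["heartbeat.interval.ms"])
    session = int(config["session.timeout.ms"])
    if heartbeat >= session:
        problems.append(
            "heartbeat.interval.ms (%d) must be lower than "
            "session.timeout.ms (%d)" % (heartbeat, session)
        )
    elif heartbeat * 3 > session:
        problems.append(
            "heartbeat.interval.ms (%d) is more than 1/3 of "
            "session.timeout.ms (%d)" % (heartbeat, session)
        )
    return problems


# Values exactly as posted in the original message.
posted = {
    "max.poll.records": 20,
    "heartbeat.interval.ms": 10000,
    "session.timeout.ms": 6000,
}

for problem in check_consumer_timeouts(posted):
    print(problem)
```

Whether 6000 was a typo for a larger value is not known from the thread, so the check is run on the values as written.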

Re: Frequent consumer rebalances, auto commit failures

Posted by Shantanu Deshmukh <sh...@gmail.com>.
Hi,
I do not have trace level logs as of now.
I am doing only very basic operations on the messages. The only time-consuming
part is sending an e-mail. Our e-mail servers are very slow, so sending one
e-mail takes up to 20 seconds. That's why I reduced max.poll.records to just 2
and kept the session timeout at 10 minutes. Rebalances would still happen.
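The numbers above can be checked with simple arithmetic. In Kafka 0.10.0.x the consumer has no background heartbeat thread (that arrived with KIP-62 in 0.10.1.0), so the time spent between two poll() calls has to stay below session.timeout.ms. A minimal sketch, assuming the worst case is max.poll.records times the slowest per-record time; the function names are hypothetical:

```python
# Illustrative arithmetic only; not part of any Kafka client API.

def worst_case_batch_ms(max_poll_records, seconds_per_record):
    """Longest time one polled batch can take to process, in milliseconds."""
    return int(max_poll_records * seconds_per_record * 1000)


def fits_in_session(max_poll_records, seconds_per_record, session_timeout_ms):
    """True if a worst-case batch finishes before the session would expire."""
    return worst_case_batch_ms(max_poll_records, seconds_per_record) < session_timeout_ms


# Current settings from this message: 2 records, up to 20 s each, 10 min session.
print(worst_case_batch_ms(2, 20), fits_in_session(2, 20, 10 * 60 * 1000))

# Earlier settings from the thread: 20 records per poll, 5 min session.
print(worst_case_batch_ms(20, 20), fits_in_session(20, 20, 5 * 60 * 1000))
```

Under these assumptions a batch of 2 takes at most 40 seconds, far below a 10 minute session timeout, so slow e-mail sending alone would not explain rebalances with the current settings. A batch of 20 at 20 seconds each would exceed a 5 minute timeout.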

However, there's an update. While trying out config tuning I had set
max.poll.interval.ms to 3 minutes. Later I found that this setting does
not apply to Kafka 0.10.0.1, which we are using, so I removed it.
Now, more than a week after that change, I haven't seen any rebalance
issue. But the slow consumer startup issue still persists. Whenever I
restart my consumer process there is no activity for almost 5 minutes. I
checked the broker logs for that period and saw the message "preparing to
stabilise consumer group", then a gap of 5 minutes, and then the message
"stabilized group". What could be happening here?
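One possible explanation, offered as a hypothesis rather than a diagnosis: in 0.10.0.x the group coordinator keeps a rebalance open until every known member has rejoined or the session timeout has passed. If the old consumer threads are killed without leaving the group, the coordinator may wait for them until their sessions expire, which would match a gap close to the session timeout. The toy model below illustrates that reasoning; join_wait_seconds and its inputs are hypothetical and greatly simplify the real protocol.

```python
# Toy model, for illustration only. It assumes the rebalance completes when
# the last stale member's session expires, capped at the session timeout.

def join_wait_seconds(stale_member_ages_s, session_timeout_s):
    """Estimate how long new consumers wait before the group stabilizes.

    stale_member_ages_s: seconds since each dead member's last heartbeat,
    measured at the moment the new consumers try to join.
    """
    if not stale_member_ages_s:
        # Old members left the group cleanly, so nothing to wait for.
        return 0
    remaining = [max(0, session_timeout_s - age) for age in stale_member_ages_s]
    return min(max(remaining), session_timeout_s)


# Five old consumers killed about 5 s before the restart, 5 min session timeout.
print(join_wait_seconds([5, 5, 5, 5, 5], 300))

# Old consumers closed cleanly before the restart.
print(join_wait_seconds([], 300))
```

If this is what is happening, closing each consumer on shutdown (KafkaConsumer.close()) so that it leaves the group, or using a shorter session timeout, would be expected to shorten the startup gap.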
