Posted to dev@kafka.apache.org by "Ramkumar (JIRA)" <ji...@apache.org> on 2018/04/03 20:21:00 UTC

[jira] [Created] (KAFKA-6745) kafka consumer rebalancing takes long time (from 3 secs to 5 minutes)

Ramkumar created KAFKA-6745:
-------------------------------

             Summary: kafka consumer rebalancing takes long time (from 3 secs to 5 minutes)
                 Key: KAFKA-6745
                 URL: https://issues.apache.org/jira/browse/KAFKA-6745
             Project: Kafka
          Issue Type: Improvement
          Components: clients, core
    Affects Versions: 0.11.0.0
            Reporter: Ramkumar


Hi, we had an HTTP service (3 nodes) built around Kafka 0.8. This HTTP service acts as a REST API so that publishers and consumers can use the middleware instead of the Kafka client API. With this setup, consumer rebalancing was not a major issue.

We wanted to upgrade to Kafka 0.11, so we updated our HTTP service (3-node cluster) to use the new Kafka consumer API. With it, rebalancing of consumers (multiple consumers in the same group) takes anywhere from 3 seconds to 5 minutes (max.poll.interval.ms). Because of this delay, our HTTP clients time out and fail over. This rebalancing time is a major issue. It is not clear from the documentation whether the rebalance for the group takes place after max.poll.interval.ms, or whether it starts after 3 seconds and can complete at any time within 5 minutes. We tried reducing max.poll.interval.ms to 15 seconds, but this also triggers rebalances internally.

Below are the other parameters we have set in our service:
max.poll.interval.ms = 30 seconds
heartbeat.interval.ms = 1 minute
session.timeout.ms = 4 minutes
consumer.cache.timeout = 2 minutes
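
For readers who want to reproduce the setup, here is a minimal sketch of the three Kafka consumer settings listed above, expressed in milliseconds as the consumer expects. The class and method names are hypothetical and not part of the reporter's service. consumer.cache.timeout is left out because it is assumed to be a setting of the HTTP service rather than a Kafka consumer config. The check reflects the guidance in the Kafka consumer documentation that heartbeat.interval.ms must be lower than session.timeout.ms and is typically set to no more than one third of it.

```java
import java.util.Properties;

// Hypothetical helper; only the property keys are real Kafka consumer configs.
final class ConsumerTimeouts {

    // Values as reported in this issue, converted to milliseconds.
    static Properties reported() {
        Properties p = new Properties();
        p.setProperty("max.poll.interval.ms", "30000");   // 30 seconds
        p.setProperty("heartbeat.interval.ms", "60000");  // 1 minute
        p.setProperty("session.timeout.ms", "240000");    // 4 minutes
        return p;
    }

    // Reads a millisecond-valued property as a long.
    static long ms(Properties p, String key) {
        return Long.parseLong(p.getProperty(key));
    }

    // True when the heartbeat interval is at most one third of the session timeout.
    static boolean heartbeatWithinThird(Properties p) {
        return ms(p, "heartbeat.interval.ms") * 3 <= ms(p, "session.timeout.ms");
    }

    public static void main(String[] args) {
        Properties p = reported();
        System.out.println("heartbeat <= session/3: " + heartbeatWithinThird(p));
    }
}
```

The same Properties object could be passed to a KafkaConsumer constructor once bootstrap servers, a group id, and deserializers are added; those are omitted here because the issue does not state them.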
 
 
Below is the log:
2018-03-26 12:53:23,009 [qtp1404928347-11556] INFO  org.apache.kafka.clients.consumer.internals.AbstractCoordinator - (Re-)joining group firstnetportal_001

2018-03-26 12:57:52,793 [qtp1404928347-11556] INFO  org.apache.kafka.clients.consumer.internals.AbstractCoordinator - Successfully joined group firstnetportal_001 with generation 7475
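
To make the delay concrete, the gap between the two log lines above can be computed from their timestamps. This is a small sketch with a hypothetical class name; it assumes the log timestamps follow the pattern yyyy-MM-dd HH:mm:ss,SSS, as they appear to in the excerpt.

```java
import java.time.Duration;
import java.time.LocalDateTime;
import java.time.format.DateTimeFormatter;

// Hypothetical helper for measuring the time between two log timestamps.
final class RebalanceGap {

    // Timestamp layout assumed from the log excerpt (comma before milliseconds).
    private static final DateTimeFormatter LOG_TS =
            DateTimeFormatter.ofPattern("yyyy-MM-dd HH:mm:ss,SSS");

    // Milliseconds elapsed between two log timestamps.
    static long elapsedMillis(String start, String end) {
        LocalDateTime from = LocalDateTime.parse(start, LOG_TS);
        LocalDateTime to = LocalDateTime.parse(end, LOG_TS);
        return Duration.between(from, to).toMillis();
    }

    public static void main(String[] args) {
        long ms = elapsedMillis("2018-03-26 12:53:23,009", "2018-03-26 12:57:52,793");
        System.out.println(ms + " ms"); // 269784 ms, roughly 4 minutes 30 seconds
    }
}
```

The rejoin in this excerpt therefore took about 4.5 minutes, which is within the 5-minute upper end described in the summary.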

Please let me know if any other application/client uses an HTTP interface on 3 nodes without having this issue.

--
This message was sent by Atlassian JIRA
(v7.6.3#76005)