Posted to dev@kafka.apache.org by "Joseph Aliase (JIRA)" <ji...@apache.org> on 2016/06/13 01:52:20 UTC

[jira] [Created] (KAFKA-3828) Consumer thread stalls after consumer re balance for some partition

Joseph Aliase created KAFKA-3828:
------------------------------------

             Summary: Consumer thread stalls after consumer re balance for some partition 
                 Key: KAFKA-3828
                 URL: https://issues.apache.org/jira/browse/KAFKA-3828
             Project: Kafka
          Issue Type: Bug
          Components: consumer
         Environment: Operating System : CentOS release 6.4
Kafka Cluster: Stand alone cluster with one broker and one zookeeper.
            Reporter: Joseph Aliase
            Assignee: Neha Narkhede


While testing the new Kafka Consumer API we came across this issue. We started a single-broker Kafka cluster with the broker listening on port 9092 and ZooKeeper on port 2181.

We created a topic named test with 6 partitions. We started a consumer with the following configuration:

bootstrap.servers= host-name:9092
group.id=consumer-group
key.deserializer=StringDeserializer.class.getName()
value.deserializer=StringDeserializer.class.getName()
session.timeout.ms=30000
heartbeat.interval.ms=10000
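
For reference, the settings above could be assembled in Java roughly as follows. This is a minimal sketch rather than the code used in our test: the class name ConsumerConfigSketch and the method buildProps are hypothetical, and the deserializer entries are written as the fully qualified class name that StringDeserializer.class.getName() evaluates to.

```java
import java.util.Properties;

public class ConsumerConfigSketch {

    // Fully qualified name returned by StringDeserializer.class.getName().
    private static final String STRING_DESERIALIZER =
            "org.apache.kafka.common.serialization.StringDeserializer";

    // Builds the consumer settings from the report as java.util.Properties.
    // The bootstrap address is a parameter because "host-name" is a placeholder.
    public static Properties buildProps(String bootstrapServers) {
        Properties props = new Properties();
        props.put("bootstrap.servers", bootstrapServers);
        props.put("group.id", "consumer-group");
        props.put("key.deserializer", STRING_DESERIALIZER);
        props.put("value.deserializer", STRING_DESERIALIZER);
        props.put("session.timeout.ms", "30000");
        props.put("heartbeat.interval.ms", "10000");
        return props;
    }

    public static void main(String[] args) {
        // Print the resulting configuration, one key=value pair per line.
        Properties props = buildProps("host-name:9092");
        props.forEach((k, v) -> System.out.println(k + "=" + v));
    }
}
```

The resulting Properties object can be passed to the KafkaConsumer constructor; each of the 6 threads would need its own consumer instance, since KafkaConsumer is not safe for concurrent use by multiple threads.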

We started producing data into topic test:
sh kafka-producer-perf-test.sh --topic test --num-records 1000000 --record-size 10 --throughput 500 --producer-props bootstrap.servers=localhost:9092

A consumer instance is started with 6 threads to consume data from the 6 partitions.

We then start a second consumer instance with 6 threads. A consumer rebalance occurs and the 6 partitions are divided equally between the two instances.

Then we start a third consumer instance, again with 6 threads, and we see a rebalance occur with the partitions divided among the three consumer instances. Everything works well.

Then we stop one consumer instance, and the partitions are rebalanced between the remaining two instances.

If we stop and restart the other running instances and repeat these steps a few times, the issue occurs: a consumer holds a partition but does not consume any data from it. The partition's data remains unconsumed until we stop the consumer instance that is holding the partition.
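
One way to spot the stalled partition while reproducing this is to record, per assigned partition, when the consumed offset last advanced, and to flag partitions with no progress for longer than some threshold. The following is only a sketch of that idea under stated assumptions: StallTracker and its methods are hypothetical helpers, not part of the Kafka client, and the caller is assumed to feed in the offsets it observes (for example, the offset of the last record returned by poll). Because data is being produced continuously in this test, a lack of progress points to a stall; on an idle topic the same check would also flag partitions that simply have no new data.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class StallTracker {

    // Highest offset seen so far for each assigned partition.
    private final Map<Integer, Long> lastOffset = new HashMap<>();
    // Time (ms) at which each partition's offset last advanced.
    private final Map<Integer, Long> lastProgressMs = new HashMap<>();

    // Record an observed offset; progress is noted only if the offset advanced
    // (or the partition is seen for the first time).
    public void record(int partition, long offset, long nowMs) {
        Long previous = lastOffset.get(partition);
        if (previous == null || offset > previous) {
            lastOffset.put(partition, offset);
            lastProgressMs.put(partition, nowMs);
        }
    }

    // Forget a partition, e.g. when it is revoked during a rebalance.
    public void revoke(int partition) {
        lastOffset.remove(partition);
        lastProgressMs.remove(partition);
    }

    // Partitions whose offset has not advanced for at least thresholdMs,
    // returned in ascending partition order.
    public List<Integer> stalled(long nowMs, long thresholdMs) {
        List<Integer> result = new ArrayList<>();
        for (Map.Entry<Integer, Long> entry : lastProgressMs.entrySet()) {
            if (nowMs - entry.getValue() >= thresholdMs) {
                result.add(entry.getKey());
            }
        }
        Collections.sort(result);
        return result;
    }
}
```

Each consumer thread would call record after processing a batch, and a periodic check of stalled would list the partitions that are held but not advancing.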

We were not able to reproduce this issue when publishing data to the topic at a very low rate; however, it is easily reproduced when data is published at a high rate.





--
This message was sent by Atlassian JIRA
(v6.3.4#6332)