Posted to dev@kafka.apache.org by "Manikumar (JIRA)" <ji...@apache.org> on 2018/09/13 08:49:00 UTC

[jira] [Resolved] (KAFKA-1825) leadership election state is stale and never recovers without all brokers restarting

     [ https://issues.apache.org/jira/browse/KAFKA-1825?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Manikumar resolved KAFKA-1825.
------------------------------
    Resolution: Auto Closed

Closing inactive issue. Please reopen if the issue still exists on newer versions.

> leadership election state is stale and never recovers without all brokers restarting
> ------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1825
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1825
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8.1.1, 0.8.2.0
>            Reporter: Joe Stein
>            Priority: Critical
>         Attachments: KAFKA-1825.executable.tgz
>
>
> I am not sure what the cause is, but I can reliably and repeatedly reproduce this issue. I tried with 0.8.1.1 and 0.8.2-beta, and both behave in the same manner.
> The code to reproduce this is here: https://github.com/stealthly/go_kafka_client/tree/wipAsyncSaramaProducer/producers
> Scenario: 3 brokers, 1 ZooKeeper, 1 client (each an AWS c3.2xlarge instance).
> Create a topic.
> The producer client sends in 380,000 messages/sec (attached executable).
> Everything is fine until you kill -SIGTERM broker #2.
> At that point the state goes bad for that topic. Even trying to use the console producer (with the sarama producer off) doesn't work.
> Describing the yoyoma topic looks fine. I ran a preferred leadership election and got lots of issues; I still can't produce. The only resolution is bouncing all brokers :(
> root@ip-10-233-52-139:/opt/kafka_2.10-0.8.1.1# bin/kafka-topics.sh --zookeeper 10.218.189.234:2181 --describe
> Topic:yoyoma	PartitionCount:36	ReplicationFactor:3	Configs:
> 	Topic: yoyoma	Partition: 0	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 1	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 2	Leader: 1	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 3	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 4	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 5	Leader: 1	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 6	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 7	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 8	Leader: 1	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 9	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 10	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 11	Leader: 1	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 12	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 13	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 14	Leader: 1	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 15	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 16	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 17	Leader: 1	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 18	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 19	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 20	Leader: 1	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 21	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 22	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 23	Leader: 1	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 24	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 25	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 26	Leader: 1	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 27	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 28	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 29	Leader: 1	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 30	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 31	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 32	Leader: 1	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 33	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 34	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 35	Leader: 1	Replicas: 3,2,1	Isr: 1,3
> root@ip-10-233-52-139:/opt/kafka_2.10-0.8.1.1# bin/kafka-preferred-replica-election.sh --zookeeper 10.218.189.234:2181
> Successfully started preferred replica election for partitions Set([yoyoma,29], [yoyoma,14], [yoyoma,22], [yoyoma,15], [yoyoma,3], [yoyoma,11], [yoyoma,32], [yoyoma,23], [yoyoma,18], [yoyoma,25], [yoyoma,26], [yoyoma,1], [yoyoma,9], [yoyoma,33], [yoyoma,5], [yoyoma,12], [yoyoma,20], [yoyoma,4], [yoyoma,7], [yoyoma,24], [yoyoma,35], [yoyoma,10], [yoyoma,8], [yoyoma,2], [yoyoma,21], [yoyoma,31], [yoyoma,28], [yoyoma,19], [yoyoma,16], [yoyoma,13], [yoyoma,34], [yoyoma,0], [test-1210,0], [yoyoma,30], [yoyoma,27], [yoyoma,17], [yoyoma,6])
> [2014-12-19 18:33:56,228] INFO [ReplicaFetcherManager on broker 1] Removed fetcher for partitions [yoyoma,29],[yoyoma,14],[yoyoma,11],[yoyoma,32],[yoyoma,23],[yoyoma,26],[yoyoma,5],[yoyoma,20],[yoyoma,35],[yoyoma,8],[yoyoma,2],[yoyoma,17] (kafka.server.ReplicaFetcherManager)
> [2014-12-19 18:33:56,229] INFO Truncating log yoyoma-29 to offset 6481451. (kafka.log.Log)
> [2014-12-19 18:33:56,229] INFO Truncating log yoyoma-14 to offset 6469671. (kafka.log.Log)
> [2014-12-19 18:33:56,229] INFO Truncating log yoyoma-11 to offset 6472578. (kafka.log.Log)
> [2014-12-19 18:33:56,229] INFO Truncating log yoyoma-32 to offset 6481923. (kafka.log.Log)
> [2014-12-19 18:33:56,230] INFO Truncating log yoyoma-23 to offset 6473039. (kafka.log.Log)
> [2014-12-19 18:33:56,230] INFO Truncating log yoyoma-26 to offset 6478089. (kafka.log.Log)
> [2014-12-19 18:33:56,230] INFO Truncating log yoyoma-5 to offset 6473159. (kafka.log.Log)
> [2014-12-19 18:33:56,230] INFO Truncating log yoyoma-20 to offset 6474790. (kafka.log.Log)
> [2014-12-19 18:33:56,230] INFO Truncating log yoyoma-35 to offset 6482661. (kafka.log.Log)
> [2014-12-19 18:33:56,230] INFO Truncating log yoyoma-8 to offset 6467814. (kafka.log.Log)
> [2014-12-19 18:33:56,231] INFO Truncating log yoyoma-2 to offset 6477942. (kafka.log.Log)
> [2014-12-19 18:33:56,231] INFO Truncating log yoyoma-17 to offset 6476136. (kafka.log.Log)
> [2014-12-19 18:33:56,241] INFO [ReplicaFetcherThread-2-3], Starting  (kafka.server.ReplicaFetcherThread)
> [2014-12-19 18:33:56,243] INFO [ReplicaFetcherThread-1-3], Starting  (kafka.server.ReplicaFetcherThread)
> [2014-12-19 18:33:56,244] INFO [ReplicaFetcherThread-3-3], Starting  (kafka.server.ReplicaFetcherThread)
> [2014-12-19 18:33:56,245] INFO [ReplicaFetcherThread-0-3], Starting  (kafka.server.ReplicaFetcherThread)
> [2014-12-19 18:33:56,245] INFO [ReplicaFetcherManager on broker 1] Added fetcher for partitions ArrayBuffer([[yoyoma,23], initOffset 6473039 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,17], initOffset 6476136 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,32], initOffset 6481923 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,14], initOffset 6469671 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,20], initOffset 6474790 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,8], initOffset 6467814 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,5], initOffset 6473159 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,35], initOffset 6482661 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,2], initOffset 6477942 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,11], initOffset 6472578 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,26], initOffset 6478089 to broker id:3,host:10.51.176.70,port:9092] , [[yoyoma,29], initOffset 6481451 to broker id:3,host:10.51.176.70,port:9092] ) (kafka.server.ReplicaFetcherManager)
> [2014-12-19 18:33:56,289] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-1-1 on partition [yoyoma,29] failed due to Leader not local for partition [yoyoma,29] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,290] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-1-1 on partition [yoyoma,5] failed due to Leader not local for partition [yoyoma,5] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,290] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-1-1 on partition [yoyoma,17] failed due to Leader not local for partition [yoyoma,17] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,290] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-3-1 on partition [yoyoma,11] failed due to Leader not local for partition [yoyoma,11] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,290] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-3-1 on partition [yoyoma,23] failed due to Leader not local for partition [yoyoma,23] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,290] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-3-1 on partition [yoyoma,35] failed due to Leader not local for partition [yoyoma,35] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,290] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-2-1 on partition [yoyoma,14] failed due to Leader not local for partition [yoyoma,14] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,290] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-2-1 on partition [yoyoma,26] failed due to Leader not local for partition [yoyoma,26] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,291] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-2-1 on partition [yoyoma,2] failed due to Leader not local for partition [yoyoma,2] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,334] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-0-1 on partition [yoyoma,32] failed due to Leader not local for partition [yoyoma,32] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,334] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-0-1 on partition [yoyoma,20] failed due to Leader not local for partition [yoyoma,20] on broker 1 (kafka.server.KafkaApis)
> [2014-12-19 18:33:56,334] WARN [KafkaApi-1] Fetch request with correlation id 1845 from client ReplicaFetcherThread-0-1 on partition [yoyoma,8] failed due to Leader not local for partition [yoyoma,8] on broker 1 (kafka.server.KafkaApis)
> root@ip-10-233-52-139:/opt/kafka_2.10-0.8.1.1# bin/kafka-topics.sh --zookeeper 10.218.189.234:2181 --describe
> Topic:yoyoma	PartitionCount:36	ReplicationFactor:3	Configs:
> 	Topic: yoyoma	Partition: 0	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 1	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 2	Leader: 3	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 3	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 4	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 5	Leader: 3	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 6	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 7	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 8	Leader: 3	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 9	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 10	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 11	Leader: 3	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 12	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 13	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 14	Leader: 3	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 15	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 16	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 17	Leader: 3	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 18	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 19	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 20	Leader: 3	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 21	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 22	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 23	Leader: 3	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 24	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 25	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 26	Leader: 3	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 27	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 28	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 29	Leader: 3	Replicas: 3,2,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 30	Leader: 1	Replicas: 1,2,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 31	Leader: 1	Replicas: 2,3,1	Isr: 1,3
> 	Topic: yoyoma	Partition: 32	Leader: 3	Replicas: 3,1,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 33	Leader: 1	Replicas: 1,3,2	Isr: 1,3
> 	Topic: yoyoma	Partition: 34	Leader: 1	Replicas: 2,1,3	Isr: 1,3
> 	Topic: yoyoma	Partition: 35	Leader: 3	Replicas: 3,2,1	Isr: 1,3
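
The two --describe listings above can be compared mechanically. What follows is a minimal sketch, not part of the original report, assuming the row format shown in the listings (Topic, Partition, Leader, Replicas, Isr). The function names and the SAMPLE rows (copied from the second listing) are illustrative choices. It flags partitions whose leader is not the first replica in the list (the preferred replica) and brokers that are assigned as replicas but absent from the ISR.

```python
import re

# One partition row of `kafka-topics.sh --describe`, in the format shown above.
# The topic summary line ("Topic:yoyoma PartitionCount:36 ...") does not match.
ROW = re.compile(
    r"Topic:\s*(?P<topic>\S+)\s+Partition:\s*(?P<partition>\d+)\s+"
    r"Leader:\s*(?P<leader>-?\d+)\s+Replicas:\s*(?P<replicas>[\d,]+)\s+"
    r"Isr:\s*(?P<isr>[\d,]*)"
)


def _ids(csv):
    """Turn a comma-separated broker list such as '1,2,3' into [1, 2, 3]."""
    return [int(x) for x in csv.split(",") if x]


def parse_describe(text):
    """Return one dict per partition row found in the describe output."""
    rows = []
    for line in text.splitlines():
        m = ROW.search(line)  # search() tolerates a leading "> " quote prefix
        if m:
            rows.append({
                "topic": m.group("topic"),
                "partition": int(m.group("partition")),
                "leader": int(m.group("leader")),
                "replicas": _ids(m.group("replicas")),
                "isr": _ids(m.group("isr")),
            })
    return rows


def non_preferred_leaders(rows):
    """Partitions whose current leader is not the first (preferred) replica."""
    return [(r["topic"], r["partition"]) for r in rows
            if r["replicas"] and r["leader"] != r["replicas"][0]]


def brokers_missing_from_isr(rows):
    """Broker ids that are assigned as replicas but absent from some ISR."""
    missing = set()
    for r in rows:
        missing |= set(r["replicas"]) - set(r["isr"])
    return missing


# Three rows copied from the second listing above (illustrative subset).
SAMPLE = """
> Topic: yoyoma  Partition: 0  Leader: 1  Replicas: 1,2,3  Isr: 1,3
> Topic: yoyoma  Partition: 1  Leader: 1  Replicas: 2,3,1  Isr: 1,3
> Topic: yoyoma  Partition: 2  Leader: 3  Replicas: 3,1,2  Isr: 1,3
"""

if __name__ == "__main__":
    rows = parse_describe(SAMPLE)
    print("leader is not preferred replica:", non_preferred_leaders(rows))
    print("brokers missing from ISR:", sorted(brokers_missing_from_isr(rows)))
```

Run against the full second listing, a check like this would flag only the partitions whose preferred replica is broker 2, which matches broker 2 being down. That supports the observation in the report that the describe output itself looks fine even though producing fails.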


