Posted to dev@kafka.apache.org by "Peter Davis (JIRA)" <ji...@apache.org> on 2016/07/02 19:37:10 UTC

[jira] [Comment Edited] (KAFKA-3893) Kafka Broker ID disappears from /brokers/ids

    [ https://issues.apache.org/jira/browse/KAFKA-3893?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15360306#comment-15360306 ] 

Peter Davis edited comment on KAFKA-3893 at 7/2/16 7:36 PM:
------------------------------------------------------------

Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka -- when a ZooKeeper connection is lost, it seems that any other changes in the cluster during the loss (which would be expected if an outage affects multiple brokers) are not recognized when the broker reconnects.  We see the same loop of "Shrinking ISR" and "Cached zkVersion [###] not equal to that in zookeeper", and the broker never recovers until it is manually restarted.

For us this happened almost daily when running on a cluster of virtual machines that would get paused for a few seconds every night for a snapshot backup.  We disabled the backup, but it's very concerning that Kafka won't recover after a pause!

Seen with 0.9 and 0.10. 
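
To illustrate the symptom, here is a toy model in Python. It is not Kafka's code, and it is only a guess at the failure mode: it assumes the ISR update is a version-checked (compare-and-set) write and that the broker keeps using a cached version after another writer has changed the node. The names VersionedNode, ToyBroker, and attempts_until_success are hypothetical.

```python
class VersionedNode:
    """Toy stand-in for a znode: data plus a version bumped on every write."""

    def __init__(self, data):
        self.data = data
        self.version = 0

    def conditional_set(self, data, expected_version):
        """Write only if the caller's expected version matches (compare-and-set)."""
        if expected_version != self.version:
            return False
        self.data = data
        self.version += 1
        return True


class ToyBroker:
    """Hypothetical broker that caches the node version between writes."""

    def __init__(self, node):
        self.node = node
        self.cached_version = node.version

    def try_shrink_isr(self, new_isr, refresh_on_mismatch):
        """Attempt a version-checked ISR write using the cached version."""
        if self.node.conditional_set(new_isr, self.cached_version):
            self.cached_version = self.node.version
            return True
        if refresh_on_mismatch:
            # Re-read the current version so the next attempt can succeed.
            self.cached_version = self.node.version
        return False


def attempts_until_success(refresh_on_mismatch, max_attempts=5):
    """Return the attempt number that succeeded, or None if all attempts failed."""
    node = VersionedNode(data=[1, 2, 3])
    broker = ToyBroker(node)
    # Another writer updates the node while the broker is paused.
    node.conditional_set([2, 3], expected_version=0)
    for attempt in range(1, max_attempts + 1):
        if broker.try_shrink_isr([1], refresh_on_mismatch):
            return attempt
    return None
```

In this model, a broker that never refreshes its cached version fails every attempt, which resembles the endless "Cached zkVersion ... not equal" loop, while one that re-reads the version after a mismatch succeeds on the next try.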


was (Author: davispw):
Sriharsha, I have witnessed this too and it very much seems like a bug in Kafka -- when a ZooKeeper connection is lost, any other changes in the cluster during the loss are not recognized when the broker reconnects.  We see the same loop of "Shrinking ISR" and "Cached zkVersion [###] not equal to that in zookeeper", and the broker never recovers.

For us this happened almost daily when running on a cluster of virtual machines that would get paused for a few seconds every night for a snapshot backup.  We disabled the backup, but it's very concerning that Kafka won't recover after a pause!

> Kafka Broker ID disappears from /brokers/ids
> --------------------------------------------
>
>                 Key: KAFKA-3893
>                 URL: https://issues.apache.org/jira/browse/KAFKA-3893
>             Project: Kafka
>          Issue Type: Bug
>            Reporter: chaitra
>            Priority: Critical
>
> Kafka version used: 0.8.2.1
> Zookeeper version: 3.4.6
> We have a scenario where the Kafka broker's ID in the ZooKeeper path /brokers/ids just disappears.
> We see the ZooKeeper connection active and no network issue.
> The ZooKeeper connection timeout is set to 6000ms in server.properties.
> Hence the broker is not participating in the cluster.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)