Posted to dev@kafka.apache.org by "Omid Aladini (JIRA)" <ji...@apache.org> on 2015/02/04 15:02:35 UTC

[jira] [Created] (KAFKA-1918) System test for ZooKeeper quorum failure scenarios

Omid Aladini created KAFKA-1918:
-----------------------------------

             Summary: System test for ZooKeeper quorum failure scenarios
                 Key: KAFKA-1918
                 URL: https://issues.apache.org/jira/browse/KAFKA-1918
             Project: Kafka
          Issue Type: Test
            Reporter: Omid Aladini


Following up on the [conversation on the mailing list|http://mail-archives.apache.org/mod_mbox/kafka-users/201502.mbox/%3CCAHwHRrX3SAWDUGF5LjU4rrMUsqv%3DtJcyjX7OENeL5C_V5o3tCw%40mail.gmail.com%3E], the FAQ states:

{quote}
Once the Zookeeper quorum is down, brokers could result in a bad state and could not normally serve client requests, etc. Although when Zookeeper quorum recovers, the Kafka brokers should be able to resume to normal state automatically, _there are still a few +corner cases+ where they cannot and a hard kill-and-recovery is required to bring it back to normal_. Hence it is recommended to closely monitor your zookeeper cluster and provision it so that it is performant.
{quote}

Because ZK quorum failures are inevitable (due to rolling upgrades of ZK, leader hardware failure, etc.), it would be great to have a system test that identifies these corner cases (if they still exist) so that they can be fixed if necessary.
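
As a starting point, the quorum-loss scenarios such a test would need to cover could be enumerated roughly as follows. This is only a minimal sketch: it assumes nothing beyond the rule that a ZooKeeper ensemble has quorum while a strict majority of its servers is up. The function names and server labels are hypothetical, and the actual steps for stopping and restarting servers and for checking broker health are left out.

```python
from itertools import combinations


def has_quorum(ensemble_size: int, nodes_down: int) -> bool:
    """Return True while a strict majority of the ensemble is still up."""
    nodes_up = ensemble_size - nodes_down
    return nodes_up > ensemble_size // 2


def quorum_loss_scenarios(servers):
    """Yield every minimal set of servers whose simultaneous failure breaks quorum.

    Hypothetical helper: a system test could iterate over these sets,
    stop the listed servers, bring them back, and then check the brokers.
    """
    n = len(servers)
    # Smallest number of failures that leaves no strict majority.
    min_failures = n - n // 2
    for down in combinations(servers, min_failures):
        yield set(down)


if __name__ == "__main__":
    # Hypothetical three-node ensemble; the labels are placeholders.
    for scenario in quorum_loss_scenarios(["zk1", "zk2", "zk3"]):
        print(sorted(scenario))
```

For a three-node ensemble this yields the three pairs of servers; losing any one pair is enough to take the quorum down, while losing a single server is not.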



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)