Posted to dev@kafka.apache.org by "Gwen Shapira (JIRA)" <ji...@apache.org> on 2014/10/31 03:25:34 UTC

[jira] [Assigned] (KAFKA-1082) zkclient dies after UnknownHostException in zk reconnect

     [ https://issues.apache.org/jira/browse/KAFKA-1082?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Gwen Shapira reassigned KAFKA-1082:
-----------------------------------

    Assignee: Gwen Shapira

> zkclient dies after UnknownHostException in zk reconnect
> --------------------------------------------------------
>
>                 Key: KAFKA-1082
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1082
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.7.2, 0.8.0
>            Reporter: Anatoly Fayngelerin
>            Assignee: Gwen Shapira
>         Attachments: KAFKA-1082.patch
>
>
> Moving this here from the dev list:
> I've run into the following issue with the Kafka server. The zkclient lib seems to die silently if there is an UnknownHostException (or any IOException) while reconnecting the ZK session. I've filed a bug about this with the zkclient lib (https://github.com/sgroschupf/zkclient/issues/23). The ramifications for Kafka were the silent loss of all ephemeral nodes associated with the affected process.
> It is fairly easy to reproduce this locally using the following steps:
> -- Configure a local Kafka broker to connect to a local ZK instance using a DNS alias (e.g. add "127.0.0.1 kafka-test-dns" to your /etc/hosts).
> -- Start the broker, observe that ephemeral nodes have been added to ZK
> -- Suspend the broker process, preventing it from sending heartbeats to the ZK instance. Observe the loss of ephemeral nodes in ZK.
> -- Remove the DNS alias (e.g. comment out the /etc/hosts line).
> -- Upon resuming the broker, the UnknownHostException is logged. After this point, the server cannot re-establish its ZK connection. Re-enabling the alias, for example, does not resume normal operation. The broker continues accepting requests without participating in the ZK protocols.
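
The failure mode in the steps above comes down to a reconnect path that treats a name-resolution error as fatal instead of retriable. The following is a minimal sketch of the retriable alternative, written in Python for brevity rather than in zkclient's Java. It is not the attached KAFKA-1082.patch and not zkclient's actual code; the names resolve and reconnect_with_retry, the attempt limit, and the backoff values are all hypothetical choices made for illustration.

```python
import socket
import time
from typing import Callable, Tuple

Address = Tuple[str, int]


def resolve(host: str, port: int) -> Address:
    """Resolve host to an IPv4 address.

    Raises socket.gaierror (a subclass of OSError) when the name is unknown,
    which is the Python counterpart of Java's UnknownHostException.
    """
    infos = socket.getaddrinfo(host, port, socket.AF_INET, socket.SOCK_STREAM)
    ip, resolved_port = infos[0][4]
    return ip, resolved_port


def reconnect_with_retry(
    host: str,
    port: int,
    resolver: Callable[[str, int], Address] = resolve,
    max_attempts: int = 5,
    base_delay: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> Address:
    """Hypothetical reconnect step that survives transient resolution errors.

    Each failed attempt is followed by an exponentially growing delay
    (capped at 30 seconds) before the next one. Only after max_attempts
    consecutive failures is the error surfaced to the caller, so the
    failure is never silent.
    """
    last_error = None
    for attempt in range(max_attempts):
        try:
            return resolver(host, port)
        except OSError as err:
            # Resolution or I/O failure: remember it and retry instead of
            # letting the reconnect logic die.
            last_error = err
            if attempt < max_attempts - 1:
                sleep(min(base_delay * 2 ** attempt, 30.0))
    raise ConnectionError(
        f"could not resolve {host}:{port} after {max_attempts} attempts"
    ) from last_error
```

With the alias from the reproduction steps, a call such as reconnect_with_retry("kafka-test-dns", 2181) would keep retrying while the /etc/hosts entry is commented out and succeed once it is restored, provided that happens within the attempt budget. The port 2181 is ZooKeeper's conventional default and is an assumption about the local setup.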



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)