You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Andrej Urvantsev (JIRA)" <ji...@apache.org> on 2018/02/06 18:08:00 UTC

[jira] [Commented] (KAFKA-5971) Broker keeps running even though not registered in ZK

    [ https://issues.apache.org/jira/browse/KAFKA-5971?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16354265#comment-16354265 ] 

Andrej Urvantsev commented on KAFKA-5971:
-----------------------------------------

I think that we faced the same problem. Running Kafka 1.0.0
{noformat}
[2018-02-05 19:43:23,460] INFO Unable to read additional data from server sessionid 0x101aa15a51e000a, likely server has closed socket, closing socket co
nnection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,561] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-05 19:43:23,584] INFO Opening socket connection to server 172.30.35.44/172.30.35.44:2181. Will not attempt to authenticate using SASL (unknown e
rror) (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,673] INFO Socket connection established to 172.30.35.44/172.30.35.44:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,700] WARN Unable to reconnect to ZooKeeper service, session 0x101aa15a51e000a has expired (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,700] INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient)
[2018-02-05 19:43:23,700] INFO Unable to reconnect to ZooKeeper service, session 0x101aa15a51e000a has expired, closing socket connection (org.apache.zoo
keeper.ClientCnxn)
[2018-02-05 19:43:23,700] INFO Initiating client connection, connectString=172.30.34.209:2181,172.30.36.237:2181,172.30.35.44:2181 sessionTimeout=6000 wa
tcher=org.I0Itec.zkclient.ZkClient@6572421 (org.apache.zookeeper.ZooKeeper)
[2018-02-05 19:43:23,703] INFO Opening socket connection to server 172.30.34.209/172.30.34.209:2181. Will not attempt to authenticate using SASL (unknown
 error) (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,704] INFO Socket connection established to 172.30.34.209/172.30.34.209:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,704] INFO EventThread shut down for session: 0x101aa15a51e000a (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,705] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and a
ttempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:23,805] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-02-05 19:43:23,805] INFO Waiting for keeper state SyncConnected (org.I0Itec.zkclient.ZkClient)
[2018-02-05 19:43:24,523] INFO Opening socket connection to server 172.30.36.237/172.30.36.237:2181. Will not attempt to authenticate using SASL (unknown
 error) (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:24,523] INFO Socket connection established to 172.30.36.237/172.30.36.237:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:24,524] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and a
ttempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:24,886] INFO Socket connection established to 172.30.35.44/172.30.35.44:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:25,460] INFO Unable to read additional data from server sessionid 0x0, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:27,327] INFO Opening socket connection to server 172.30.34.209/172.30.34.209:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:27,328] INFO Socket connection established to 172.30.34.209/172.30.34.209:2181, initiating session (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:27,332] INFO Session establishment complete on server 172.30.34.209/172.30.34.209:2181, sessionid = 0x101b18ebf3b0001, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
[2018-02-05 19:43:27,332] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
[2018-02-05 19:43:27,334] INFO re-registering broker info in ZK for broker 3523 (kafka.server.KafkaHealthcheck$SessionExpireListener)
[2018-02-05 19:43:27,334] INFO Creating /brokers/ids/3523 (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2018-02-05 19:43:27,340] INFO Result of znode creation is: NODEEXISTS (kafka.utils.ZKCheckedEphemeral)
[2018-02-05 19:43:27,341] ERROR Error handling event ZkEvent[New session event sent to kafka.server.KafkaHealthcheck$SessionExpireListener@493b2c1d] (org.I0Itec.zkclient.ZkEventThread)
java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/3523. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it appears to be re-registering.
        at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:440)
        at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:426)
        at kafka.server.KafkaHealthcheck.register(KafkaHealthcheck.scala:73)
        at kafka.server.KafkaHealthcheck$SessionExpireListener.handleNewSession(KafkaHealthcheck.scala:121)
        at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:736)
        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:72)
[2018-02-05 19:43:31,070] INFO Creating /controller (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
[2018-02-05 19:43:31,074] INFO Result of znode creation is: OK (kafka.utils.ZKCheckedEphemeral)
[2018-02-05 19:46:43,829] INFO [GroupMetadataManager brokerId=3523] Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
{noformat}

> Broker keeps running even though not registered in ZK
> -----------------------------------------------------
>
>                 Key: KAFKA-5971
>                 URL: https://issues.apache.org/jira/browse/KAFKA-5971
>             Project: Kafka
>          Issue Type: Bug
>          Components: core
>    Affects Versions: 0.11.0.0
>            Reporter: Igor Canadi
>            Priority: Major
>
> We had a curious situation happen to our kafka cluster running version 0.11.0.0. One of the brokers was happily running, even though its ID was not registered in Zookeeper under `/brokers/ids`.
> Based on the logs, it appears that the broker restarted very quickly and there was a node under `/brokers/ids/2` still present from the previous run. However, in that case I'd expect the broker to try again or just exit. In reality it continued running without any errors in the logs.
> Here's the relevant part of the logs: 
> ```
> [2017-09-06 23:50:26,095] INFO Opening socket connection to server zookeeper.kafka.svc.cluster.local/100.66.99.54:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,096] INFO Socket connection established to zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,099] WARN Unable to reconnect to ZooKeeper service, session 0x15e4477405f1d40 has expired (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,099] INFO zookeeper state changed (Expired) (org.I0Itec.zkclient.ZkClient)
> [2017-09-06 23:50:26,099] INFO Unable to reconnect to ZooKeeper service, session 0x15e4477405f1d40 has expired, closing socket connection (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,099] INFO Initiating client connection, connectString=zookeeper:2181 sessionTimeout=6000 watcher=org.I0Itec.zkclient.ZkClient@2cb4893b (org.apache.zookeeper.ZooKeeper)
> [2017-09-06 23:50:26,102] INFO EventThread shut down for session: 0x15e4477405f1d40 (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,107] INFO Opening socket connection to server zookeeper.kafka.svc.cluster.local/100.66.99.54:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,108] INFO Socket connection established to zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,111] INFO Session establishment complete on server zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, sessionid = 0x15e599a1a3e0013, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> [2017-09-06 23:50:26,112] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
> [2017-09-06 23:50:26,114] INFO re-registering broker info in ZK for broker 2 (kafka.server.KafkaHealthcheck$SessionExpireListener)
> [2017-09-06 23:50:26,115] INFO Creating /brokers/ids/2 (is it secure? false) (kafka.utils.ZKCheckedEphemeral)
> [2017-09-06 23:50:26,123] INFO Result of znode creation is: NODEEXISTS (kafka.utils.ZKCheckedEphemeral)
> [2017-09-06 23:50:26,124] ERROR Error handling event ZkEvent[New session event sent to kafka.server.KafkaHealthcheck$SessionExpireListener@699f40a0] (org.I0Itec.zkclient.ZkEventThread)
> java.lang.RuntimeException: A broker is already registered on the path /brokers/ids/2. This probably indicates that you either have configured a brokerid that is already in use, or else you have shutdown this broker and restarted it faster than the zookeeper timeout so it
>     at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:417)
>     at kafka.utils.ZkUtils.registerBrokerInZk(ZkUtils.scala:403)
>     at kafka.server.KafkaHealthcheck.register(KafkaHealthcheck.scala:70)
>     at kafka.server.KafkaHealthcheck$SessionExpireListener.handleNewSession(KafkaHealthcheck.scala:104)
>     at org.I0Itec.zkclient.ZkClient$6.run(ZkClient.java:736)
>     at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:72)
> [2017-09-06 23:51:42,257] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:00:06,198] INFO Unable to read additional data from server sessionid 0x15e599a1a3e0013, likely server has closed socket, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:06,354] INFO zookeeper state changed (Disconnected) (org.I0Itec.zkclient.ZkClient)
> [2017-09-07 00:00:07,675] INFO Opening socket connection to server zookeeper.kafka.svc.cluster.local/100.66.99.54:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:07,676] INFO Socket connection established to zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, initiating session (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:07,680] INFO Session establishment complete on server zookeeper.kafka.svc.cluster.local/100.66.99.54:2181, sessionid = 0x15e599a1a3e0013, negotiated timeout = 6000 (org.apache.zookeeper.ClientCnxn)
> [2017-09-07 00:00:07,681] INFO zookeeper state changed (SyncConnected) (org.I0Itec.zkclient.ZkClient)
> [2017-09-07 00:01:42,257] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:11:42,257] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:21:42,257] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> [2017-09-07 00:31:42,257] INFO [Group Metadata Manager on Broker 2]: Removed 0 expired offsets in 0 milliseconds. (kafka.coordinator.group.GroupMetadataManager)
> ```
> The only message that appears after this point is the "Removed 0 expired offsets", which happens every 10min.
> Let me know if I can provide any more information!



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)