Posted to jira@kafka.apache.org by "Edoardo Comar (JIRA)" <ji...@apache.org> on 2018/05/01 09:35:00 UTC

[jira] [Commented] (KAFKA-6839) ZK session retry with cname record

    [ https://issues.apache.org/jira/browse/KAFKA-6839?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16459559#comment-16459559 ] 

Edoardo Comar commented on KAFKA-6839:
--------------------------------------

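Java caches DNS lookups inside the JVM, so a changed record may not be picked up until the cached entry expires. The cache lifetime is controlled by the networkaddress.cache.ttl security property: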

[https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/java-dg-jvm-ttl.html]
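
As a rough illustration of the setting described on that page, the sketch below lowers the JVM's positive DNS cache TTL programmatically. The class and method names (DnsCacheTtl, applyTtl) and the 60 second value are made up for the example; only the networkaddress.cache.ttl property and the java.security.Security calls are standard JDK. The property generally has to be set before the JVM performs its first lookup, and it only affects the JVM's own cache. Whether the ZooKeeper client re-resolves the hostname on reconnect is a separate question that this sketch does not address.

```java
import java.security.Security;

// Hypothetical helper, not part of Kafka or ZooKeeper.
final class DnsCacheTtl {

    // Sets the JVM-wide TTL (in seconds) for successful DNS lookups and
    // returns the value the JVM now reports for the property.
    static String applyTtl(int seconds) {
        Security.setProperty("networkaddress.cache.ttl", Integer.toString(seconds));
        return Security.getProperty("networkaddress.cache.ttl");
    }

    public static void main(String[] args) {
        // Assumption: 60 seconds is only an example value, not a recommendation.
        System.out.println("networkaddress.cache.ttl = " + applyTtl(60));
    }
}
```

The same property can also be set in the JRE's java.security file, which avoids having to change application code.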

> ZK session retry with cname record
> ----------------------------------
>
>                 Key: KAFKA-6839
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6839
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 1.1.0
>            Reporter: Tyler Monahan
>            Priority: Major
>
> I have a 3-node Kafka cluster set up in AWS that talks to a 3-node ZK cluster behind an ELB. I am giving the Kafka instances a DNS CNAME record that points to the AWS ELB, which is itself a CNAME record pointing to two A records. When the AWS ELB CNAME record changes the two A records it points at and Kafka tries to reconnect to ZK after losing a session, it uses the old A records and not the new ones, so the reconnect attempt fails. There appears to be some kind of caching in effect instead of a fresh lookup of the record that is set in the config file.
> This is the error message I am seeing in the broker logs.
> {code:java}
> [2018-04-30 20:09:21,449] INFO Opening socket connection to server ip-10-65-68-244.us-west-2.compute.internal/10.65.68.244:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:24,450] WARN Client session timed out, have not heard from server in 3962ms for sessionid 0x263094512190001 (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:24,451] INFO Client session timed out, have not heard from server in 3962ms for sessionid 0x263094512190001, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:26,532] INFO Opening socket connection to server ip-10-65-84-102.us-west-2.compute.internal/10.65.84.102:2181. Will not attempt to authenticate using SASL (unknown error) (org.apache.zookeeper.ClientCnxn)
> [2018-04-30 20:09:29,531] WARN Session 0x263094512190001 for server null, unexpected error, closing socket connection and attempting reconnect (org.apache.zookeeper.ClientCnxn)
> java.net.NoRouteToHostException: No route to host
> at sun.nio.ch.SocketChannelImpl.checkConnect(Native Method)
> at sun.nio.ch.SocketChannelImpl.finishConnect(SocketChannelImpl.java:717)
> at org.apache.zookeeper.ClientCnxnSocketNIO.doTransport(ClientCnxnSocketNIO.java:361)
> at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1141)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)