Posted to jira@kafka.apache.org by "Jun Rao (JIRA)" <ji...@apache.org> on 2019/02/22 20:19:00 UTC
[jira] [Commented] (KAFKA-7987) a broker's ZK session may die on transient auth failure
[ https://issues.apache.org/jira/browse/KAFKA-7987?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=16775565#comment-16775565 ]
Jun Rao commented on KAFKA-7987:
--------------------------------
One potential fix is to handle auth failure in ZooKeeperClient the same way as session expiration: keep retrying to establish the connection until it succeeds.
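
To illustrate the idea, here is a minimal, self-contained sketch. It is not the actual ZooKeeperClient code and does not use the ZooKeeper library: the State enum, needsNewSession, and reinitializeUntilConnected are hypothetical names made up for this example, and the Supplier stands in for whatever creates a new ZK session. The only point is that AUTH_FAILED is routed through the same retry loop as EXPIRED.

```java
import java.util.Arrays;
import java.util.Iterator;
import java.util.function.Supplier;

public class ZkReinitSketch {
    /** Hypothetical subset of connection outcomes; not the real ZooKeeper enum. */
    enum State { CONNECTED, EXPIRED, AUTH_FAILED }

    /** Auth failure is treated like session expiration: both need a fresh session. */
    static boolean needsNewSession(State s) {
        return s == State.EXPIRED || s == State.AUTH_FAILED;
    }

    /**
     * Keeps creating sessions until one connects, sleeping backoffMs between
     * attempts. Returns the number of attempts made. The newSession supplier is
     * an assumption standing in for the real session-creation logic.
     */
    static int reinitializeUntilConnected(Supplier<State> newSession, long backoffMs) {
        int attempts = 0;
        while (true) {
            attempts++;
            State s = newSession.get();
            if (!needsNewSession(s)) {
                return attempts;
            }
            try {
                Thread.sleep(backoffMs);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag and stop retrying.
                Thread.currentThread().interrupt();
                throw new IllegalStateException("interrupted while retrying", e);
            }
        }
    }

    public static void main(String[] args) {
        // Simulated outcomes: two transient auth failures, then success.
        Iterator<State> outcomes =
            Arrays.asList(State.AUTH_FAILED, State.AUTH_FAILED, State.CONNECTED).iterator();
        int attempts = reinitializeUntilConnected(outcomes::next, 10L);
        System.out.println("connected after " + attempts + " attempts");
        // prints "connected after 3 attempts"
    }
}
```

A real fix would presumably also need to decide on a backoff policy and on whether some auth failures (for example, a misconfigured principal) should eventually be treated as fatal; the sketch retries unconditionally, as proposed above.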
> a broker's ZK session may die on transient auth failure
> -------------------------------------------------------
>
> Key: KAFKA-7987
> URL: https://issues.apache.org/jira/browse/KAFKA-7987
> Project: Kafka
> Issue Type: Improvement
> Reporter: Jun Rao
> Priority: Major
>
> After a transient network issue, we saw the following log in a broker.
> {code:java}
> [23:37:02,102] ERROR SASL authentication with Zookeeper Quorum member failed: javax.security.sasl.SaslException: An error: (java.security.PrivilegedActionException: javax.security.sasl.SaslException: GSS initiate failed [Caused by GSSException: No valid credentials provided (Mechanism level: Server not found in Kerberos database (7))]) occurred when evaluating Zookeeper Quorum Member's received SASL token. Zookeeper Client will go to AUTH_FAILED state. (org.apache.zookeeper.ClientCnxn)
> [23:37:02,102] ERROR [ZooKeeperClient] Auth failed. (kafka.zookeeper.ZooKeeperClient)
> {code}
> The network issue prevented the broker from communicating with ZK. The broker's ZK session then expired, but the broker didn't know that yet since it couldn't establish a connection to ZK. When the network came back, the broker tried to establish a connection to ZK, but failed due to an auth failure (likely caused by a transient KDC issue). The current logic just ignores the auth failure without trying to create a new ZK session. The broker then stays permanently in a state where it's alive but not registered in ZK.
>
--
This message was sent by Atlassian JIRA
(v7.6.3#76005)