Posted to jira@kafka.apache.org by "Jun Rao (Jira)" <ji...@apache.org> on 2022/02/24 22:14:00 UTC

[jira] [Commented] (KAFKA-13461) KafkaController stops functioning as active controller after ZooKeeperClient auth failure

    [ https://issues.apache.org/jira/browse/KAFKA-13461?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17497763#comment-17497763 ] 

Jun Rao commented on KAFKA-13461:
---------------------------------

Basically, when no JAAS configuration is set up for the ZK client and the client tries to establish a new connection, it first receives an AUTH_FAIL event. However, this doesn't mean that the client's session is gone, since the client retries the connection without auth, which typically succeeds. Previously, we mistakenly tried to reinitialize the controller on the AUTH_FAIL event, which caused the controller to resign but not regain the controllership, since the underlying session was still valid.
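
To make the distinction concrete, here is a minimal sketch of the two decisions. It is not the actual patch: the SessionEvent enum, the sessionAlive flag and both method names are hypothetical stand-ins, and the only behavior taken from the description above is that an AUTH_FAIL event on a still-valid session should not trigger reinitialization.

```java
public class AuthFailedSketch {
    // Hypothetical stand-in for the ZK client's connection events.
    public enum SessionEvent { SYNC_CONNECTED, DISCONNECTED, AUTH_FAILED, EXPIRED }

    // Behavior described as the bug: AUTH_FAILED is treated like session loss,
    // so the controller is reinitialized (and resigns) even though the
    // session is still valid.
    public static boolean reinitializeBeforeFix(SessionEvent event) {
        return event == SessionEvent.EXPIRED || event == SessionEvent.AUTH_FAILED;
    }

    // Sketch of the intended behavior: AUTH_FAILED alone is not enough.
    // Reinitialize only when the session is really gone.
    // sessionAlive is an assumed input, not a real Kafka or ZooKeeper parameter.
    public static boolean reinitializeAfterFix(SessionEvent event, boolean sessionAlive) {
        if (event == SessionEvent.EXPIRED) {
            return true;
        }
        if (event == SessionEvent.AUTH_FAILED) {
            return !sessionAlive;
        }
        return false;
    }

    public static void main(String[] args) {
        // No JAAS configured, reconnect without auth succeeds, session still valid.
        System.out.println(reinitializeBeforeFix(SessionEvent.AUTH_FAILED));      // prints "true"
        System.out.println(reinitializeAfterFix(SessionEvent.AUTH_FAILED, true)); // prints "false"
    }
}
```

With the session still alive, the second check returns false, so the controller would neither resign nor need to win a new election.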

> KafkaController stops functioning as active controller after ZooKeeperClient auth failure
> -----------------------------------------------------------------------------------------
>
>                 Key: KAFKA-13461
>                 URL: https://issues.apache.org/jira/browse/KAFKA-13461
>             Project: Kafka
>          Issue Type: Bug
>          Components: zkclient
>            Reporter: Vincent Jiang
>            Assignee: Rajini Sivaram
>            Priority: Major
>             Fix For: 3.1.0, 3.0.1
>
>
> When java.security.auth.login.config is present, but there is no "Client" section, ZookeeperSaslClient creation fails and raises LoginException, resulting in this warning log:
> {code:java}
> WARN SASL configuration failed: javax.security.auth.login.LoginException: No JAAS configuration section named 'Client' was found in specified JAAS configuration file: '***'. Will continue connection to Zookeeper server without SASL authentication, if Zookeeper server allows it.{code}
> When this happens after initial startup, ClientCnxn enqueues an AuthFailed event, which triggers the following sequence:
>  # zkclient reinitialization is triggered
>  # the controller resigns.
>  # Before the controller's ZK session expires, the controller successfully connects to ZK and maintains the current session.
>  # In KafkaController.elect(), the controller sets activeControllerId to itself and short-circuits the rest of the election. Since the controller resigned earlier and also skips the call to onControllerFailover(), the controller is not actually functioning as the active controller (e.g. the necessary ZK watchers haven't been registered).
>  



--
This message was sent by Atlassian Jira
(v8.20.1#820001)