Posted to jira@kafka.apache.org by "Jagadish (Jira)" <ji...@apache.org> on 2020/04/08 10:13:00 UTC

[jira] [Commented] (KAFKA-9836) org.apache.zookeeper.KeeperException$SessionMovedException: KeeperErrorCode = Session moved for /controller_epoch

    [ https://issues.apache.org/jira/browse/KAFKA-9836?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17078024#comment-17078024 ] 

Jagadish commented on KAFKA-9836:
---------------------------------

I need help understanding what caused these errors, and how to fix and prevent this issue without impacting the data in the topics.

> org.apache.zookeeper.KeeperException$SessionMovedException: KeeperErrorCode = Session moved for /controller_epoch
> -----------------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-9836
>                 URL: https://issues.apache.org/jira/browse/KAFKA-9836
>             Project: Kafka
>          Issue Type: Bug
>          Components: admin
>    Affects Versions: 2.3.0
>            Reporter: Jagadish
>            Priority: Critical
>             Fix For: 2.3.0
>
>
> We have a 3-node Kafka cluster on RHEL.
> We are getting the following WARN messages on 2 nodes when using the console consumer / console producer.
> +Consumer Warning+
> [2020-04-08 06:05:02,356] WARN [Consumer clientId=consumer-1, groupId=console-consumer-5952] 1 partitions have leader brokers without a matching listener, including [DR27SAL_S_EVT_ACT-0] (org.apache.kafka.clients.NetworkClient)
> +Producer Warning+
> [2020-04-08 06:06:39,177] WARN [Producer clientId=console-producer] 2 partitions have leader brokers without a matching listener, including [FirstConsoleTopic-5, FirstConsoleTopic-2] (org.apache.kafka.clients.NetworkClient)
>  
> +Got the following errors in the state change log of one of our Kafka brokers+
> [2020-04-06 20:08:03,063] TRACE [Controller id=2 epoch=21] Received response {error_code=0} for request UPDATE_METADATA with correlation id 2 sent to broker cghcts000000946.corporate.ge.com:9092 (id: 2 rack: null) (state.change.logger)
> [2020-04-06 20:08:03,120] ERROR [Controller id=2 epoch=21] Controller 2 epoch 21 initiated state change of replica 1 for partition __consumer_offsets-22 from ReplicaDeletionIneligible to OfflineReplica failed (state.change.logger)
>  org.apache.zookeeper.KeeperException$SessionMovedException: KeeperErrorCode = Session moved for /controller_epoch
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:134)
>  at org.apache.zookeeper.KeeperException.create(KeeperException.java:54)
>  at kafka.zk.KafkaZkClient$.kafka$zk$KafkaZkClient$$unwrapResponseWithControllerEpochCheck(KafkaZkClient.scala:1864)
>  at kafka.zk.KafkaZkClient.$anonfun$retryRequestsUntilConnected$2(KafkaZkClient.scala:1650)
>  at scala.collection.TraversableLike.$anonfun$map$1(TraversableLike.scala:237)
>  at scala.collection.mutable.ResizableArray.foreach(ResizableArray.scala:62)
>  at scala.collection.mutable.ResizableArray.foreach$(ResizableArray.scala:55)
>  at scala.collection.mutable.ArrayBuffer.foreach(ArrayBuffer.scala:49)
>  at scala.collection.TraversableLike.map(TraversableLike.scala:237)
>  at scala.collection.TraversableLike.map$(TraversableLike.scala:230)
>  at scala.collection.AbstractTraversable.map(Traversable.scala:108)
>  at kafka.zk.KafkaZkClient.retryRequestsUntilConnected(KafkaZkClient.scala:1650)
>  at kafka.zk.KafkaZkClient.setTopicPartitionStatesRaw(KafkaZkClient.scala:204)
>  at kafka.zk.KafkaZkClient.updateLeaderAndIsr(KafkaZkClient.scala:261)
>  at kafka.controller.ZkReplicaStateMachine.doRemoveReplicasFromIsr(ReplicaStateMachine.scala:318)
>  at kafka.controller.ZkReplicaStateMachine.removeReplicasFromIsr(ReplicaStateMachine.scala:282)
>  at kafka.controller.ZkReplicaStateMachine.doHandleStateChanges(ReplicaStateMachine.scala:219)
>  at kafka.controller.ZkReplicaStateMachine.$anonfun$handleStateChanges$2(ReplicaStateMachine.scala:111)
>  at kafka.controller.ZkReplicaStateMachine.$anonfun$handleStateChanges$2$adapted(ReplicaStateMachine.scala:110)
>  at scala.collection.immutable.Map$Map1.foreach(Map.scala:128)
>  at kafka.controller.ZkReplicaStateMachine.handleStateChanges(ReplicaStateMachine.scala:110)
>  at kafka.controller.ReplicaStateMachine.startup(ReplicaStateMachine.scala:42)
>  at kafka.controller.KafkaController.onControllerFailover(KafkaController.scala:268)
>  at kafka.controller.KafkaController.elect(KafkaController.scala:1226)
>  at kafka.controller.KafkaController.processReelect(KafkaController.scala:1543)
>  at kafka.controller.KafkaController.process(KafkaController.scala:1584)
>  at kafka.controller.QueuedEvent.process(ControllerEventManager.scala:53)
>  at kafka.controller.ControllerEventManager$ControllerEventThread.$anonfun$doWork$1(ControllerEventManager.scala:137)
>  at scala.runtime.java8.JFunction0$mcV$sp.apply(JFunction0$mcV$sp.java:23)
>  at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>  at kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:137)
>  at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:89)
>  



--
This message was sent by Atlassian Jira
(v8.3.4#803005)