You are viewing a plain text version of this content. The canonical link for it is here.
Posted to dev@kafka.apache.org by "Michał Woś (JIRA)" <ji...@apache.org> on 2016/12/13 02:03:59 UTC

[jira] [Issue Comment Deleted] (KAFKA-1407) Broker can not return to ISR because of BadVersionException

     [ https://issues.apache.org/jira/browse/KAFKA-1407?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Michał Woś updated KAFKA-1407:
------------------------------
    Comment: was deleted

(was: [~dmitrybugaychenko] +1
The same issue in 0.8.2.0. Scenario
1. Broker looses session from zookeeper (because some reasons - nvm here)
2. Brokers and zookeeper establish new session and this happens:
{noformat}
2016-12-10 10:28:22,924 INFO org.apache.zookeeper.ClientCnxn: Unable to reconnect to ZooKeeper service, session 0x25453ff793e91ed has expired, closing socket connection
2016-12-10 10:28:22,924 INFO org.I0Itec.zkclient.ZkClient: zookeeper state changed (Expired)
2016-12-10 10:28:22,924 INFO org.apache.zookeeper.ZooKeeper: Initiating client connection, connectString=machine_address2, machine_address3, machine_address4 sessionTimeout=15000 watcher=org.I0Itec.zkclient.ZkClient@145eaa29
2016-12-10 10:28:22,936 INFO org.apache.zookeeper.ClientCnxn: EventThread shut down
2016-12-10 10:28:22,938 INFO org.apache.zookeeper.ClientCnxn: Opening socket connection to server machine_address1. Will not attempt to authenticate using SASL (unknown error)
2016-12-10 10:28:22,940 INFO org.apache.zookeeper.ClientCnxn: Socket connection established to machine_address1, initiating session
2016-12-10 10:28:22,947 INFO org.apache.zookeeper.ClientCnxn: Session establishment complete on server machine_address1, sessionid = 0x65453fead67efb4, negotiated timeout = 15000
2016-12-10 10:28:22,947 INFO org.I0Itec.zkclient.ZkClient: zookeeper state changed (SyncConnected)

2016-12-10 10:28:23,014 INFO kafka.log.OffsetIndex: Deleting index /home/amy/data/kafka/topic_name-0/00000000034872197725.index.deleted
2016-12-10 10:28:23,040 INFO kafka.log.OffsetIndex: Deleting index /home/amy/data/kafka/topic_name-0/00000000034871445334.index.deleted
2016-12-10 10:28:23,183 INFO kafka.log.OffsetIndex: Deleting index /home/amy/data/kafka/topic_name-0/00000000034882006637.index.deleted
2016-12-10 10:28:23,188 INFO kafka.log.OffsetIndex: Deleting index /home/amy/data/kafka/topic_name-0/00000000034875965334.index.deleted

2016-12-10 10:28:23,277 INFO kafka.controller.KafkaController$SessionExpirationListener: [SessionExpirationListener on 2110], ZK expired; shut down all controller components and try to re-elect
2016-12-10 10:28:23,319 ERROR org.I0Itec.zkclient.ZkEventThread: Error handling event ZkEvent[New session event sent to kafka.controller.KafkaController$SessionExpirationListener@70ee82ce]
java.lang.IllegalStateException: Kafka scheduler has not been started
        at kafka.utils.KafkaScheduler.ensureStarted(KafkaScheduler.scala:114)
        at kafka.utils.KafkaScheduler.shutdown(KafkaScheduler.scala:86)
        at kafka.controller.KafkaController.onControllerResignation(KafkaController.scala:350)
        at kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSessionapply$mcZ$sp(KafkaController.scala:1108)
        at kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSessionapply(KafkaController.scala:1107)
        at kafka.controller.KafkaController$SessionExpirationListener$$anonfun$handleNewSessionapply(KafkaController.scala:1107)
        at kafka.utils.Utils$.inLock(Utils.scala:561)
        at kafka.controller.KafkaController$SessionExpirationListener.handleNewSession(KafkaController.scala:1107)
        at org.I0Itec.zkclient.ZkClientrun(ZkClient.java:472)
        at org.I0Itec.zkclient.ZkEventThread.run(ZkEventThread.java:71)
2016-12-10 10:28:23,320 INFO kafka.server.KafkaHealthcheck: re-registering broker info in ZK for broker 2110
2016-12-10 10:28:23,367 INFO kafka.utils.ZkUtils$: Registered broker 2110 at path /brokers/ids/2110 with address machine_address.
2016-12-10 10:28:23,370 INFO kafka.server.KafkaHealthcheck: done re-registering broker
2016-12-10 10:28:23,380 INFO kafka.server.KafkaHealthcheck: Subscribing to /brokers/topics path to watch for new topics
2016-12-10 10:28:23,456 INFO kafka.server.ZookeeperLeaderElector$LeaderChangeListener: New leader is 828
2016-12-10 10:28:24,820 INFO kafka.log.Log: Rolled new log segment for 'Dispatcher.client_calls-0' in 1 ms.
{noformat}
Meantime, after session is established, my producer gets for each messege:
{noformat}
org.apache.kafka.common.errors.NotLeaderForPartitionException: This server is not the leader for that topic-partition
and messages are lost.
{noformat})

> Broker can not return to ISR because of BadVersionException
> -----------------------------------------------------------
>
>                 Key: KAFKA-1407
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1407
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.1
>            Reporter: Dmitry Bugaychenko
>            Assignee: Neha Narkhede
>
> Each morning we found a broker out of ISR at stuck with log full of messages:
> {code}
> INFO   | jvm 1    | 2014/04/21 08:36:21 | [2014-04-21 09:36:21,907] ERROR Conditional update of path /brokers/topics/topic2/partitions/1/state with data {"controller_epoch":46,"leader":2,"version":1,"leader_epoch":38,"isr":[2]} and expected version 53 failed due to org.apache.zookeeper.KeeperException$BadVersionException: KeeperErrorCode = BadVersion for /brokers/topics/topic2/partitions/1/state (kafka.utils.ZkUtils$)
> INFO   | jvm 1    | 2014/04/21 08:36:21 | [2014-04-21 09:36:21,907] INFO Partition [topic2,1] on broker 2: Cached zkVersion [53] not equal to that in zookeeper, skip updating ISR (kafka.cluster.Partition)
> {code}
> It seems that it can not recover after short netwrok break down and the only way to return it is restart it using kill -9.



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)