You are viewing a plain text version of this content. The canonical link for it is here.
Posted to jira@kafka.apache.org by "Dong Lin (JIRA)" <ji...@apache.org> on 2018/10/02 14:30:00 UTC

[jira] [Updated] (KAFKA-6438) NSEE while concurrently creating and deleting a topic

     [ https://issues.apache.org/jira/browse/KAFKA-6438?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Dong Lin updated KAFKA-6438:
----------------------------
    Fix Version/s:     (was: 2.1.0)

> NSEE while concurrently creating and deleting a topic
> -----------------------------------------------------
>
>                 Key: KAFKA-6438
>                 URL: https://issues.apache.org/jira/browse/KAFKA-6438
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 1.0.0
>         Environment: kafka_2.11-1.0.0.jar
> OpenJDK Runtime Environment (build 1.8.0_102-b14), OpenJDK 64-Bit Server VM (build 25.102-b14, mixed mode)
> CentOS Linux release 7.3.1611 (Core)
>            Reporter: Adam Kotwasinski
>            Priority: Major
>              Labels: reliability
>
> It appears that deleting a topic and creating it at the same time can cause NSEE, what later results in a forced controller shutdown.
> Most probably topics are being created because consumers/producers are still active (yes, this means the deletion is happening blindly).
> The main problem here (for me) is the controller switch, the data loss and following unclean election is acceptable (as we admit to deleting blindly).
> Environment description:
> 20 kafka brokers
> 80k partitions (20k topics 4partitions each)
> 3 node ZK
> Incident:
> {code:java}
> [2018-01-09 11:19:05,912] INFO [Topic Deletion Manager 6], Partition deletion callback for mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:06,237] INFO [Controller id=6] New leader and ISR for partition mytopic-0 is {"leader":-1,"leader_epoch":1,"isr":[]} (kafka.controller.KafkaController)
> [2018-01-09 11:19:06,412] INFO [Topic Deletion Manager 6], Deletion for replicas 12,9,10,11 for partition mytopic-3,mytopic-0,mytopic-1,mytopic-2 of topic mytopic in progress (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:07,218] INFO [Topic Deletion Manager 6], Deletion for replicas 12,9,10,11 for partition mytopic-3,mytopic-0,mytopic-1,mytopic-2 of topic mytopic in progress (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:07,304] INFO [Topic Deletion Manager 6], Deletion for replicas 12,9,10,11 for partition mytopic-3,mytopic-0,mytopic-1,mytopic-2 of topic mytopic in progress (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:07,383] INFO [Topic Deletion Manager 6], Deletion for replicas 12,9,10,11 for partition mytopic-3,mytopic-0,mytopic-1,mytopic-2 of topic mytopic in progress (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:07,510] INFO [Topic Deletion Manager 6], Deletion for replicas 12,9,10,11 for partition mytopic-3,mytopic-0,mytopic-1,mytopic-2 of topic mytopic in progress (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:07,661] INFO [Topic Deletion Manager 6], Deletion for replicas 12,9,10,11 for partition mytopic-3,mytopic-0,mytopic-1,mytopic-2 of topic mytopic in progress (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:07,728] INFO [Topic Deletion Manager 6], Deletion for replicas 9,10,11 for partition mytopic-0,mytopic-1,mytopic-2 of topic mytopic in progress (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:07,924] INFO [PartitionStateMachine controllerId=6] Invoking state change to OfflinePartition for partitions mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.PartitionStateMachine)
> [2018-01-09 11:19:07,924] INFO [PartitionStateMachine controllerId=6] Invoking state change to NonExistentPartition for partitions mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.PartitionStateMachine)
> [2018-01-09 11:19:08,592] INFO [Controller id=6] New topics: [Set(mytopic, other, other2)], deleted topics: [Set()], new partition replica assignment [Map(other-0 -> Vector(8), mytopic-2 -> Vector(6), mytopic-0 -> Vector(4), other-2 -> Vector(10), mytopic-1 -> Vector(5), mytopic-3 -> Vector(7), other-1 -> Vector(9), other-3 -> Vector(11))] (kafka.controller.KafkaController)
> [2018-01-09 11:19:08,593] INFO [Controller id=6] New topic creation callback for other-0,mytopic-2,mytopic-0,other-2,mytopic-1,mytopic-3,other-1,other-3 (kafka.controller.KafkaController)
> [2018-01-09 11:19:08,596] INFO [Controller id=6] New partition creation callback for other-0,mytopic-2,mytopic-0,other-2,mytopic-1,mytopic-3,other-1,other-3 (kafka.controller.KafkaController)
> [2018-01-09 11:19:08,596] INFO [PartitionStateMachine controllerId=6] Invoking state change to NewPartition for partitions other-0,mytopic-2,mytopic-0,other-2,mytopic-1,mytopic-3,other-1,other-3 (kafka.controller.PartitionStateMachine)
> [2018-01-09 11:19:08,642] INFO [PartitionStateMachine controllerId=6] Invoking state change to OnlinePartition for partitions other-0,mytopic-2,mytopic-0,other-2,mytopic-1,mytopic-3,other-1,other-3 (kafka.controller.PartitionStateMachine)
> [2018-01-09 11:19:08,828] INFO [Topic Deletion Manager 6], Partition deletion callback for mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.TopicDeletionManager)
> [2018-01-09 11:19:09,127] INFO [Controller id=6] New leader and ISR for partition mytopic-0 is {"leader":-1,"leader_epoch":1,"isr":[]} (kafka.controller.KafkaController)
> [2018-01-09 11:19:09,607] ERROR [controller-event-thread]: Error processing event TopicDeletion(Set(mytopic, other)) (kafka.controller.Contr
> ollerEventManager$ControllerEventThread)
> java.util.NoSuchElementException: key not found: mytopic-0
>         at scala.collection.MapLike$class.default(MapLike.scala:228)
>         at scala.collection.AbstractMap.default(Map.scala:59)
>         at scala.collection.mutable.HashMap.apply(HashMap.scala:65)
>         at kafka.controller.ControllerBrokerRequestBatch.kafka$controller$ControllerBrokerRequestBatch$$updateMetadataRequestPartitionInfo$1(ControllerChannelManager.scala:358)
>         at kafka.controller.ControllerBrokerRequestBatch$$anonfun$addUpdateMetadataRequestForBrokers$3.apply(ControllerChannelManager.scala:394)
>         at kafka.controller.ControllerBrokerRequestBatch$$anonfun$addUpdateMetadataRequestForBrokers$3.apply(ControllerChannelManager.scala:394)
>         at scala.collection.mutable.HashSet.foreach(HashSet.scala:78)
>         at kafka.controller.ControllerBrokerRequestBatch.addUpdateMetadataRequestForBrokers(ControllerChannelManager.scala:394)
>         at kafka.controller.KafkaController.sendUpdateMetadataRequest(KafkaController.scala:1039)
>         at kafka.controller.TopicDeletionManager.kafka$controller$TopicDeletionManager$$onTopicDeletion(TopicDeletionManager.scala:266)
>         at kafka.controller.TopicDeletionManager$$anonfun$resumeDeletions$2.apply(TopicDeletionManager.scala:361)
>         at kafka.controller.TopicDeletionManager$$anonfun$resumeDeletions$2.apply(TopicDeletionManager.scala:333)
>         at scala.collection.immutable.Set$Set3.foreach(Set.scala:163)
>         at kafka.controller.TopicDeletionManager.resumeDeletions(TopicDeletionManager.scala:333)
>         at kafka.controller.TopicDeletionManager.enqueueTopicsForDeletion(TopicDeletionManager.scala:113)
>         at kafka.controller.KafkaController$TopicDeletion.process(KafkaController.scala:1299)
>         at kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply$mcV$sp(ControllerEventManager.scala:53)
>         at kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:53)
>         at kafka.controller.ControllerEventManager$ControllerEventThread$$anonfun$doWork$1.apply(ControllerEventManager.scala:53)
>         at kafka.metrics.KafkaTimer.time(KafkaTimer.scala:31)
>         at kafka.controller.ControllerEventManager$ControllerEventThread.doWork(ControllerEventManager.scala:52)
>         at kafka.utils.ShutdownableThread.run(ShutdownableThread.scala:64)
> [2018-01-09 11:19:09,657] INFO [Controller id=6] New topics: [Set(mytopic)], deleted topics: [Set()], new partition replica assignment [Map(mytopic-2 -> Vector(6), mytopic-0 -> Vector(4), mytopic-1 -> Vector(5), mytopic-3 -> Vector(7))] (kafka.controller.KafkaController)
> [2018-01-09 11:19:09,658] INFO [Controller id=6] New topic creation callback for mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.KafkaController)
> [2018-01-09 11:19:09,658] INFO [Controller id=6] New partition creation callback for mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.KafkaController)
> [2018-01-09 11:19:09,658] INFO [PartitionStateMachine controllerId=6] Invoking state change to NewPartition for partitions mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.PartitionStateMachine)
> [2018-01-09 11:19:09,659] INFO [PartitionStateMachine controllerId=6] Invoking state change to OnlinePartition for partitions mytopic-2,mytopic-0,mytopic-1,mytopic-3 (kafka.controller.PartitionStateMachine)
> [2018-01-09 11:19:09,664] WARN [OfflinePartitionLeaderSelector]: No broker in ISR is alive for mytopic-0. Elect leader 4 from live brokers 4. There's potential data loss. (kafka.controller.OfflinePartitionLeaderSelector)
> [2018-01-09 11:19:09,664] INFO [OfflinePartitionLeaderSelector]: Selected new leader and ISR {"leader":4,"leader_epoch":2,"isr":[4]} for offline partition mytopic-0 (kafka.controller.OfflinePartitionLeaderSelector)
> [2018-01-09 11:19:09,769] ERROR [Controller id=6] Forcing the controller to resign (kafka.controller.KafkaController)
> {code}



--
This message was sent by Atlassian JIRA
(v7.6.3#76005)