Posted to jira@kafka.apache.org by "Luke Chen (Jira)" <ji...@apache.org> on 2022/09/02 08:39:00 UTC

[jira] [Updated] (KAFKA-14197) Kraft broker fails to startup after topic creation failure

     [ https://issues.apache.org/jira/browse/KAFKA-14197?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Luke Chen updated KAFKA-14197:
------------------------------
    Description: 
In KRaft's ControllerWriteEvent, we first apply the records to the controller's in-memory state, then send them out via the raft client. But if an error occurs while sending the records, there is no way to revert the changes to the controller's in-memory state [1].
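The ordering problem above can be sketched as follows. This is a minimal, self-contained illustration with hypothetical names (inMemoryState, appendToRaft), not the actual QuorumController code:
{code:java}
import java.util.HashMap;
import java.util.Map;

public class ControllerWriteSketch {
    // Stands in for the controller's in-memory metadata state.
    static final Map<String, String> inMemoryState = new HashMap<>();

    // Simulates a raft client whose append can fail (e.g. buffer allocation error).
    static void appendToRaft(String record, boolean failAppend) {
        if (failAppend) {
            throw new RuntimeException("Cannot allocate buffer");
        }
    }

    static void controllerWriteEvent(String topic, boolean failAppend) {
        // Step 1: apply the record to the controller's in-memory state.
        inMemoryState.put(topic, "created");
        // Step 2: send the record out via the raft client.
        // If this throws, nothing reverts step 1.
        appendToRaft("CreateTopicRecord(" + topic + ")", failAppend);
    }

    public static void main(String[] args) {
        try {
            controllerWriteEvent("my-topic", true); // the append fails
        } catch (RuntimeException e) {
            // The topic record was never persisted to the metadata log...
        }
        // ...yet the controller still believes the topic exists.
        System.out.println(inMemoryState.containsKey("my-topic")); // prints "true"
    }
}
{code}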

The issue happens during topic creation: the controller state is updated with the topic and partition metadata (e.g. the broker-to-ISR map), but the records fail to be sent out (e.g. due to a buffer allocation error). Later, when the node is shut down, the controlled shutdown tries to remove the broker from the ISR via [2]:
{code:java}
generateLeaderAndIsrUpdates("enterControlledShutdown[" + brokerId + "]", brokerId, NO_LEADER, records, brokersToIsrs.partitionsWithBrokerInIsr(brokerId));{code}
 

After the partitionChangeRecords are appended and sent to the metadata topic successfully, brokers fail to "replay" these partition changes, because the topics/partitions were never successfully created in the first place.

Even worse, after restarting the node, all the metadata records are replayed again and the same error recurs, so the broker cannot start up at all.

 

The error and call stack look like this; essentially, it complains that the topic image can't be found:
{code:java}
[2022-09-02 16:29:16,334] ERROR Encountered metadata loading fault: Error replaying metadata log record at offset 81 (org.apache.kafka.server.fault.LoggingFaultHandler)
java.lang.NullPointerException
    at org.apache.kafka.image.TopicDelta.replay(TopicDelta.java:69)
    at org.apache.kafka.image.TopicsDelta.replay(TopicsDelta.java:91)
    at org.apache.kafka.image.MetadataDelta.replay(MetadataDelta.java:248)
    at org.apache.kafka.image.MetadataDelta.replay(MetadataDelta.java:186)
    at kafka.server.metadata.BrokerMetadataListener.$anonfun$loadBatches$3(BrokerMetadataListener.scala:239)
    at java.base/java.util.ArrayList.forEach(ArrayList.java:1541)
    at kafka.server.metadata.BrokerMetadataListener.kafka$server$metadata$BrokerMetadataListener$$loadBatches(BrokerMetadataListener.scala:232)
    at kafka.server.metadata.BrokerMetadataListener$HandleCommitsEvent.run(BrokerMetadataListener.scala:113)
    at org.apache.kafka.queue.KafkaEventQueue$EventContext.run(KafkaEventQueue.java:121)
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.handleEvents(KafkaEventQueue.java:200)
    at org.apache.kafka.queue.KafkaEventQueue$EventHandler.run(KafkaEventQueue.java:173)
    at java.base/java.lang.Thread.run(Thread.java:829)
{code}
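The stack trace boils down to a lookup that assumes the topic already exists in the broker's metadata image. A minimal sketch of that failure mode (hypothetical names, not the real TopicDelta code):
{code:java}
import java.util.HashMap;
import java.util.Map;

public class ReplaySketch {
    // Maps topic name -> leader id, standing in for the broker's topic image.
    static final Map<String, Integer> topicImage = new HashMap<>();

    static void replayPartitionChange(String topic, int newLeader) {
        // Returns null here, because the topic's creation record was never committed.
        Integer currentLeader = topicImage.get(topic);
        // Mirrors the NPE at TopicDelta.replay: the code assumes the topic exists.
        if (currentLeader.intValue() != newLeader) {
            topicImage.put(topic, newLeader);
        }
    }

    public static void main(String[] args) {
        try {
            replayPartitionChange("my-topic", -1); // -1 == NO_LEADER on controlled shutdown
        } catch (NullPointerException e) {
            System.out.println("NPE: partition change for a topic missing from the image");
        }
    }
}
{code}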
 

[1] [https://github.com/apache/kafka/blob/ef65b6e566ef69b2f9b58038c98a5993563d7a68/metadata/src/main/java/org/apache/kafka/controller/QuorumController.java#L779-L804] 

[2] [https://github.com/apache/kafka/blob/trunk/metadata/src/main/java/org/apache/kafka/controller/ReplicationControlManager.java#L1270]


> Kraft broker fails to startup after topic creation failure
> ----------------------------------------------------------
>
>                 Key: KAFKA-14197
>                 URL: https://issues.apache.org/jira/browse/KAFKA-14197
>             Project: Kafka
>          Issue Type: Bug
>          Components: kraft
>            Reporter: Luke Chen
>            Priority: Major
>



--
This message was sent by Atlassian Jira
(v8.20.10#820010)