Posted to dev@kafka.apache.org by "Guozhang Wang (JIRA)" <ji...@apache.org> on 2013/08/29 03:57:51 UTC

[jira] [Commented] (KAFKA-1032) Messages sent to the old leader will be lost on broker GC resulted failure

    [ https://issues.apache.org/jira/browse/KAFKA-1032?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13753180#comment-13753180 ] 

Guozhang Wang commented on KAFKA-1032:
--------------------------------------

Proposed approach:

1. Add a call to addStopReplicaRequestForBrokers with deletePartition = false to the handling of a replica state change to offline. At present this is triggered only by onBrokerFailure and stopOldReplicasOfReassignedPartition.

2. In shutdownBroker of KafkaController, remove the direct call:

brokerRequestBatch.addStopReplicaRequestForBrokers(Seq(id), topicAndPartition.topic, topicAndPartition.partition, deletePartition = false)
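
To make the two steps concrete, here is a minimal sketch of the idea, not the actual controller code. TopicAndPartition, ReplicaState, RequestBatchSketch and ReplicaStateMachineSketch are simplified stand-ins invented for illustration; only the addStopReplicaRequestForBrokers call shape is taken from the line above.

```scala
// Hypothetical, simplified stand-ins; these are NOT the real Kafka controller classes.
case class TopicAndPartition(topic: String, partition: Int)

sealed trait ReplicaState
case object OnlineReplica extends ReplicaState
case object OfflineReplica extends ReplicaState

// Records the StopReplica requests that would be sent to each broker.
class RequestBatchSketch {
  private val queued =
    scala.collection.mutable.ListBuffer.empty[(Int, TopicAndPartition, Boolean)]

  def addStopReplicaRequestForBrokers(brokerIds: Seq[Int], topic: String,
                                      partition: Int, deletePartition: Boolean): Unit =
    brokerIds.foreach { id =>
      queued += ((id, TopicAndPartition(topic, partition), deletePartition))
    }

  def stopReplicaRequests: List[(Int, TopicAndPartition, Boolean)] = queued.toList
}

class ReplicaStateMachineSketch(batch: RequestBatchSketch) {
  // Step 1: queue the StopReplica request as part of the offline transition,
  // so every caller of the transition gets it without a separate direct call
  // (which is why step 2 can drop the call from shutdownBroker).
  def handleStateChange(tp: TopicAndPartition, replicaId: Int,
                        target: ReplicaState): Unit =
    target match {
      case OfflineReplica =>
        batch.addStopReplicaRequestForBrokers(
          Seq(replicaId), tp.topic, tp.partition, deletePartition = false)
      case OnlineReplica => () // nothing to stop in this sketch
    }
}
```

In this sketch, any path that moves a replica to offline (broker failure, reassignment, or controlled shutdown) queues exactly one StopReplica request with deletePartition = false.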
                
> Messages sent to the old leader will be lost on broker GC resulted failure
> --------------------------------------------------------------------------
>
>                 Key: KAFKA-1032
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1032
>             Project: Kafka
>          Issue Type: Bug
>    Affects Versions: 0.8
>            Reporter: Guozhang Wang
>            Assignee: Guozhang Wang
>
> As pointed out by Swapnil, today when a broker is in a long GC pause, it will be marked by the controller as failed, which triggers the onBrokerFailure function to migrate leadership to other brokers. However, the controller does not notify the broker with a stopReplica request even after a new leader has been elected for its partitions. The new leader will hence stop fetching from the old leader, while the old leader is not aware that it is no longer the leader. And since the old leader is not really dead, producers will not refresh their metadata immediately and will continue sending messages to the old leader. The old leader will only learn that it is no longer the leader when it is notified by the controller in the onBrokerStartup function, so messages sent between the time the new leader is elected and the time the old leader realizes it is no longer the leader will be lost.
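
The loss window in the description above can be modelled with a small sketch. Everything here (Send, LossWindowSketch, the timestamps) is hypothetical and only illustrates the timeline; it does not correspond to any Kafka API.

```scala
// Hypothetical model of the timeline in the issue description; all names are illustrative.
case class Send(timeMs: Long, payload: String)

object LossWindowSketch {
  // A message is lost if the old leader accepts it after the new leader has been
  // elected (nobody fetches from the old leader any more) and before the old
  // leader learns that it is no longer the leader.
  def lostMessages(sends: Seq[Send],
                   newLeaderElectedAtMs: Long,
                   oldLeaderNotifiedAtMs: Long): Seq[Send] =
    sends.filter { s =>
      s.timeMs >= newLeaderElectedAtMs && s.timeMs < oldLeaderNotifiedAtMs
    }
}
```

Under this model, the sooner the old leader is notified after the election, the smaller the window; if the notification time equals the election time, the window is empty.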
