Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2013/10/22 00:02:43 UTC

[jira] [Created] (KAFKA-1097) Race condition while reassigning partition leads to incorrect ISR information in zookeeper

Neha Narkhede created KAFKA-1097:
------------------------------------

             Summary: Race condition while reassigning partition leads to incorrect ISR information in zookeeper 
                 Key: KAFKA-1097
                 URL: https://issues.apache.org/jira/browse/KAFKA-1097
             Project: Kafka
          Issue Type: Bug
          Components: controller
    Affects Versions: 0.8
            Reporter: Neha Narkhede
            Assignee: Neha Narkhede
            Priority: Critical


While moving partitions, the controller moves the old replicas through the following state changes:

ONLINE -> OFFLINE -> NON_EXISTENT

During the offline state change, the controller removes the old replica from the ISR, writes the updated ISR to zookeeper, and notifies the leader. Note that it doesn't notify the old replicas to stop fetching from the leader (to be fixed in KAFKA-1032). During the non-existent state change, the controller does not write the updated ISR or replica list to zookeeper. Right after the non-existent state change, the controller writes the new replica list to zookeeper, but does not update the ISR. So an old replica can send a fetch request after the offline state change, which lets the leader add it back to the ISR. As a result, a non-existent replica stays in the ISR in zookeeper even though it is no longer in the replica list.
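
The interleaving described above can be sketched with a toy in-memory model. This is not Kafka code: the class ToyPartition, its method names, and the broker ids 1 to 4 are hypothetical, and the model assumes the leader re-adds any replica that is still fetching as soon as it handles a fetch request from it.

```python
class ToyPartition:
    """Toy model of one partition's state in zookeeper (hypothetical, not Kafka code)."""

    def __init__(self, leader, replicas):
        self.leader = leader
        self.zk_replicas = list(replicas)  # assigned replica list in "zookeeper"
        self.zk_isr = list(replicas)       # ISR in "zookeeper"
        # Followers that keep sending fetch requests to the leader.
        self.still_fetching = {r for r in replicas if r != leader}

    def offline_state_change(self, replica):
        # Controller removes the old replica from the ISR and writes the ISR.
        # The old replica is NOT told to stop fetching (see KAFKA-1032),
        # so still_fetching is left unchanged.
        if replica in self.zk_isr:
            self.zk_isr.remove(replica)

    def non_existent_state_change(self, replica):
        # Controller writes neither the ISR nor the replica list here.
        pass

    def write_new_replica_list(self, new_replicas):
        # Controller writes the new replica list but does not touch the ISR.
        self.zk_replicas = list(new_replicas)

    def leader_handles_fetch(self, replica):
        # Assumption: a fetching replica that is caught up is added back to the ISR.
        if replica in self.still_fetching and replica not in self.zk_isr:
            self.zk_isr.append(replica)


def simulate_race():
    # Replica 3 is the old replica being moved away; 4 is its replacement.
    p = ToyPartition(leader=1, replicas=[1, 2, 3, 4])
    p.offline_state_change(3)       # ISR becomes [1, 2, 4]
    p.leader_handles_fetch(3)       # race: old replica fetches, leader re-adds it
    p.non_existent_state_change(3)  # no zookeeper write
    p.write_new_replica_list([1, 2, 4])
    return p.zk_replicas, p.zk_isr


if __name__ == "__main__":
    replicas, isr = simulate_race()
    stale = [r for r in isr if r not in replicas]
    print("replicas:", replicas, "isr:", isr, "stale ISR members:", stale)
```

In this sketch, replica 3 ends up in the ISR but not in the replica list, which is the inconsistency reported here. If the fetch from replica 3 were blocked after the offline state change, the ISR would stay consistent with the replica list.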



--
This message was sent by Atlassian JIRA
(v6.1#6144)