Posted to dev@kafka.apache.org by "Neha Narkhede (JIRA)" <ji...@apache.org> on 2013/10/22 23:07:44 UTC

[jira] [Comment Edited] (KAFKA-1097) Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper

    [ https://issues.apache.org/jira/browse/KAFKA-1097?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13802251#comment-13802251 ] 

Neha Narkhede edited comment on KAFKA-1097 at 10/22/13 9:07 PM:
----------------------------------------------------------------

[~sriramsub] Agree that it is something of a corner case. The risk is that while a partition is in this state (with no new data coming in) and the leader dies or needs to be bounced, the controller might try to elect a non-existent replica as the leader. Since we don't retry leader elections based on the response from the broker, this could be a real issue.

On the other hand, there is a workaround: the admin can patch the ISR manually in zookeeper and then run a preferred replica election. That makes the controller send the correct ISR and assigned replica list to the leader, which also fixes the leader's internal data structures that hold the ISR.
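A minimal sketch of the manual patch step, assuming the 0.8-era zookeeper layout where /brokers/topics/<topic>/partitions/<n>/state holds a JSON document with an "isr" field. The function name, helper, and example values are illustrative, not Kafka code; an admin would read the znode, apply a fix like this, and write it back before triggering the preferred replica election:

```python
import json

def patch_isr(state_json, assigned_replicas):
    # Restrict the stored ISR to replicas that are actually assigned,
    # dropping any stale (reassigned-away) replica the leader re-added.
    state = json.loads(state_json)
    state["isr"] = [r for r in state["isr"] if r in assigned_replicas]
    return json.dumps(state)

# Example: replica 1 lingers in the ISR after the partition was
# reassigned to replicas [2, 3].
fixed = patch_isr('{"leader": 2, "isr": [1, 2, 3]}', [2, 3])
# fixed == '{"leader": 2, "isr": [2, 3]}'
```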


was (Author: nehanarkhede):
[~sriramsub] Agree that it is something of a corner case. The risk is that while a partition is in this state (with no new data coming in) and the leader dies or needs to be bounced, the controller might try to elect a non-existent replica as the leader. Since we don't retry leader elections based on the response from the broker, this could be a real issue.

> Race condition while reassigning low throughput partition leads to incorrect ISR information in zookeeper 
> ----------------------------------------------------------------------------------------------------------
>
>                 Key: KAFKA-1097
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1097
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8
>            Reporter: Neha Narkhede
>            Assignee: Neha Narkhede
>            Priority: Critical
>             Fix For: 0.8
>
>
> While moving partitions, the controller moves the old replicas through the following state changes -
> ONLINE -> OFFLINE -> NON_EXISTENT
> During the OFFLINE state change, the controller removes the old replica, writes the updated ISR to zookeeper, and notifies the leader. Note that it does not notify the old replicas to stop fetching from the leader (to be fixed in KAFKA-1032). During the NON_EXISTENT state change, the controller does not write the updated ISR or replica list to zookeeper. Right after the NON_EXISTENT state change, the controller writes the new replica list to zookeeper but does not update the ISR. So an old replica can send a fetch request after the OFFLINE state change, essentially letting the leader add it back to the ISR. The problem is that if no new data is coming in for the partition and the old replica is fully caught up, the leader cannot remove it from the ISR. That lets a non-existent replica live in the ISR at least until new data comes in to the partition.
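To make the race concrete, here is a toy model of the leader's ISR handling (not Kafka source; the class and method names are invented). The controller shrinks the ISR, but the old replica, never told to stop fetching, catches up and gets re-admitted:

```python
class PartitionLeader:
    """Toy model of a partition leader's ISR bookkeeping."""

    def __init__(self, isr, log_end_offset):
        self.isr = set(isr)
        self.log_end_offset = log_end_offset

    def apply_controller_isr(self, new_isr):
        # Controller's OFFLINE state change: the old replica is
        # dropped from the ISR and the leader is notified.
        self.isr = set(new_isr)

    def handle_fetch(self, replica_id, fetch_offset):
        # The leader re-adds any fully caught-up fetcher to the ISR,
        # even a replica the controller already removed, because the
        # old replica was never told to stop fetching (KAFKA-1032).
        if fetch_offset >= self.log_end_offset:
            self.isr.add(replica_id)

leader = PartitionLeader(isr=[1, 2], log_end_offset=100)
leader.apply_controller_isr([2])  # replica 1 taken OFFLINE by the controller
leader.handle_fetch(1, 100)      # old replica 1 still fetching, fully caught up
# With no new data arriving, replica 1 stays caught up and the leader
# never shrinks the ISR again -- the stale replica lives in the ISR.
```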



--
This message was sent by Atlassian JIRA
(v6.1#6144)