Posted to dev@kafka.apache.org by "Jay Kreps (JIRA)" <ji...@apache.org> on 2015/02/08 00:06:35 UTC

[jira] [Resolved] (KAFKA-1767) /admin/reassign_partitions deleted before reassignment completes

     [ https://issues.apache.org/jira/browse/KAFKA-1767?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Jay Kreps resolved KAFKA-1767.
------------------------------
    Resolution: Incomplete

No follow-up.

> /admin/reassign_partitions deleted before reassignment completes
> ----------------------------------------------------------------
>
>                 Key: KAFKA-1767
>                 URL: https://issues.apache.org/jira/browse/KAFKA-1767
>             Project: Kafka
>          Issue Type: Bug
>          Components: controller
>    Affects Versions: 0.8.1.1
>            Reporter: Ryan Berdeen
>            Assignee: Neha Narkhede
>
> https://github.com/apache/kafka/blob/0.8.1.1/core/src/main/scala/kafka/controller/KafkaController.scala#L477-L517 describes the process of reassigning partitions. Specifically, by the time {{/admin/reassign_partitions}} is updated, the newly assigned replicas (RAR) should be in sync, and the assigned replicas (AR) in ZooKeeper should be updated:
> {code}
> 4. Wait until all replicas in RAR are in sync with the leader.
> ...
> 10. Update AR in ZK with RAR.
> 11. Update the /admin/reassign_partitions path in ZK to remove this partition.
> {code}
> This worked in 0.8.1, but in 0.8.1.1 we observe {{/admin/reassign_partitions}} being removed before step 4 has completed.
> For example, if we have AR [1,2] and then put [3,4] in {{/admin/reassign_partitions}}, the cluster will end up with AR [1,2,3,4] and ISR [1,2] when the key is removed. Eventually, the AR will be updated to [3,4].
> This means that the {{kafka-reassign-partitions.sh}} tool will accept a new batch of reassignments before the current reassignments have finished, and our own tool that feeds in reassignments in small batches (see KAFKA-1677) can't rely on this key to detect active reassignments.
> Although we haven't observed this, it seems likely that if a controller resignation happens, the new controller won't know that a reassignment is in progress, and the AR will never be updated to the RAR.
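
The report above implies that a batching tool cannot treat the absence of the {{/admin/reassign_partitions}} key as proof that a reassignment has finished. A minimal sketch of a stricter check follows. It is not part of Kafka or of the reporter's tool: the function names and the in-memory dictionaries are hypothetical, and the caller is assumed to have already read AR and ISR for each partition (in 0.8.x these are, to my knowledge, stored under /brokers/topics/<topic> and /brokers/topics/<topic>/partitions/<n>/state in ZooKeeper). A partition counts as done only when AR equals the requested RAR and every RAR replica is in the ISR.

```python
from typing import Dict, List, Tuple

# Hypothetical type alias for this sketch: (topic, partition number).
Partition = Tuple[str, int]


def reassignment_complete(rar: List[int], ar: List[int], isr: List[int]) -> bool:
    """Return True only when AR has been replaced by RAR and all of RAR is in sync.

    rar: replicas requested in the reassignment
    ar:  assigned replicas currently recorded for the partition
    isr: in-sync replicas currently recorded for the partition
    """
    return list(ar) == list(rar) and set(rar) <= set(isr)


def pending_reassignments(
    requested: Dict[Partition, List[int]],
    state: Dict[Partition, Tuple[List[int], List[int]]],
) -> List[Partition]:
    """List the requested partitions whose reassignment has not finished.

    state maps each partition to (AR, ISR) as read by the caller.
    A partition missing from state is treated as still pending.
    """
    pending = []
    for partition, rar in requested.items():
        if partition not in state:
            pending.append(partition)
            continue
        ar, isr = state[partition]
        if not reassignment_complete(rar, ar, isr):
            pending.append(partition)
    return pending


# The situation from the report: AR [1,2] is being moved to [3,4], and the
# key has already been removed while AR is [1,2,3,4] and ISR is [1,2].
requested = {("my-topic", 0): [3, 4]}
mid_flight = {("my-topic", 0): ([1, 2, 3, 4], [1, 2])}
finished = {("my-topic", 0): ([3, 4], [3, 4])}

still_pending = pending_reassignments(requested, mid_flight)
none_pending = pending_reassignments(requested, finished)
```

With the states from the example in the report, the mid-flight snapshot still lists the partition as pending, so a tool using this check would wait before submitting the next batch even though the key is gone.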


