Posted to users@kafka.apache.org by Nick Travers <n....@gmail.com> on 2017/01/20 00:50:17 UTC

Reassigning partitions to a non-existent broker

We recently tried to rebalance partitions for a topic (via
kafka.admin.ReassignPartitionsCommand).

In the .json file describing the desired end state, an ID for a
non-existent broker was entered for one partition. Upon --execute, all
other partitions were moved without issue, but the partition with the bogus
broker ID was left stuck with an expanded replica set, and the reassignment
has failed to finish. This is presumably because the cluster is waiting
indefinitely for this new broker to become available so the partition can
be replicated there.

Any thoughts on how to remove this bogus broker ID for this partition? Has
anyone come up against this before? This is blocking us from being able to
run additional repartitioning tasks on this cluster.

Also - it seems like a repartitioning job should fail to execute if a
broker id that does not exist is specified. That would have prevented this
particular issue.
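
Until the tool performs that check itself, a pre-flight check along these
lines might catch the mistake before --execute. This is only a sketch: the
function name, the topic "events", and the broker IDs are made up for
illustration, and the list of live broker IDs has to be supplied by the
caller (for example, read from the IDs the brokers register under
/brokers/ids in ZooKeeper).

```python
import json


def find_unknown_brokers(reassignment_json, live_broker_ids):
    """Map (topic, partition) to the replica IDs that are not live brokers.

    reassignment_json: text of the file passed to the reassignment tool.
    live_broker_ids:   iterable of broker IDs known to be registered
                       (assumed to be gathered by the caller).
    """
    plan = json.loads(reassignment_json)
    live = set(live_broker_ids)
    problems = {}
    for entry in plan.get("partitions", []):
        unknown = [b for b in entry["replicas"] if b not in live]
        if unknown:
            problems[(entry["topic"], entry["partition"])] = unknown
    return problems


# Hypothetical plan: broker 99 does not exist in a cluster of brokers 1-3.
example_plan = json.dumps({
    "version": 1,
    "partitions": [
        {"topic": "events", "partition": 0, "replicas": [1, 2]},
        {"topic": "events", "partition": 1, "replicas": [2, 99]},
    ],
})

if __name__ == "__main__":
    for (topic, partition), ids in find_unknown_brokers(example_plan, [1, 2, 3]).items():
        print("%s-%d references unknown broker(s): %s" % (topic, partition, ids))
```

An empty result means every replica ID in the plan refers to a known
broker; anything else is a reason not to run --execute yet.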

Thanks in advance!
- nick

Re: Reassigning partitions to a non-existent broker

Posted by Nick Travers <n....@gmail.com>.
Looping back on this for posterity. In case anyone else runs into this, the
solution was as follows:
- add a new node with the bogus broker ID
- let the cluster equilibrate / expand ISR sets
- move any partitions that have been assigned to this broker to the other
(original) brokers in the cluster (with the reassign-partitions script)
- decommission the broker
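
For the third step, the reassignment file can be written by hand, but a
small helper along these lines could generate it. This is a rough sketch and
not what we actually ran: the function name, the topic "events", and the
broker IDs are placeholders, and the replacement broker is picked naively
(the first remaining broker that is not already a replica), with no attempt
to balance load.

```python
import json


def plan_without_broker(current_assignment, broker_to_remove, remaining_brokers):
    """Build a reassignment plan that moves replicas off broker_to_remove.

    current_assignment: dict in the reassignment JSON layout
                        ({"version": 1, "partitions": [...]}).
    remaining_brokers:  broker IDs that should keep hosting replicas.
    Only partitions that currently include broker_to_remove are emitted.
    """
    partitions = []
    for entry in current_assignment["partitions"]:
        replicas = list(entry["replicas"])
        if broker_to_remove not in replicas:
            continue  # this partition does not touch the broker
        candidates = [b for b in remaining_brokers
                      if b != broker_to_remove and b not in replicas]
        if not candidates:
            raise ValueError("no spare broker for %s-%d"
                             % (entry["topic"], entry["partition"]))
        # Swap in the first candidate, keeping the replica order otherwise.
        new_replicas = [candidates[0] if b == broker_to_remove else b
                        for b in replicas]
        partitions.append({"topic": entry["topic"],
                           "partition": entry["partition"],
                           "replicas": new_replicas})
    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Hypothetical state after broker 99 has joined and caught up.
    current = {"version": 1, "partitions": [
        {"topic": "events", "partition": 0, "replicas": [1, 2]},
        {"topic": "events", "partition": 1, "replicas": [2, 99]},
    ]}
    print(json.dumps(plan_without_broker(current, 99, [1, 2, 3]), indent=2))
```

The printed JSON would then be saved to a file and handed to the
reassign-partitions script via --reassignment-json-file with --execute.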

Opened KAFKA-4681 to track this issue.
