Posted to users@kafka.apache.org by Yury Ruchin <yu...@gmail.com> on 2015/06/11 10:05:02 UTC

Kafka 0.8.1.1 arbitrarily increased replication factor

Hello,

I've run into a weird situation with Kafka 0.8.1.1. I had an operating
cluster which I wanted to extend with new brokers. The sequence was as
follows:

1. I added the brokers to the cluster and ensured that they appeared under
/brokers/ids.

2. Ran the reassign-partitions tool to redistribute the data and load evenly
across all the brokers. There were 2 topics with 200 partitions each, both
with replication factor 3.

3. Data transfer between replicas was too slow, so I decided to increase
num.replica.fetchers from 1 to 4 to speed up the process. I adjusted the
broker configuration and began a rolling restart, one broker at a time. Over
the course of the restarts I noticed lots of errors in the logs, such as
"topic is in the process of being deleted" (which obviously was not the
case) and "incorrect LeaderAndIsr received". I had no idea what to do about
them, so I restarted some of the brokers again.

4. Waited for a while so that the replicas caught up.

5. Ran preferred-replica-election and finished the process.
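
For illustration, the kind of plan fed to the reassign-partitions tool in
step 2 can be sketched as below. This is a minimal sketch, not the exact
plan I used: the topic name "topic-a", the broker IDs 1-6 and the
round-robin placement are assumptions made up for the example. Only the
JSON layout (version, partitions, topic, partition, replicas) follows what
the tool reads via --reassignment-json-file.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor=3):
    """Build a reassignment plan that spreads replicas round-robin.

    The placement strategy is an assumption for illustration; the tool
    itself accepts any explicit replica list per partition.
    """
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Pick `replication_factor` consecutive brokers, wrapping around.
        replicas = [broker_ids[(p + i) % len(broker_ids)]
                    for i in range(replication_factor)]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical topic name and broker IDs.
plan = build_reassignment("topic-a", 200, [1, 2, 3, 4, 5, 6])
# Show the first entry only; the full plan has one entry per partition.
print(json.dumps(plan["partitions"][0]))
```

Every entry in such a plan lists exactly 3 replicas, which is what I
expected the topics to end up with once the reassignment completed.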

Observations. When I ran kafka-topics.sh --list during the reassignment, I
saw more than 3 replicas for some partitions in the "Replicas" field. I
assumed this was expected, since a partition might be assigned to a
completely different set of replicas that did not overlap with the
original replicas. The bad thing is that this situation has not changed
since. I still see 4-6 replicas in "Replicas" and "ISR" for many partitions,
even though kafka-topics.sh --under-replicated does not show anything. What
is worse, kafka-topics.sh --describe shows "Replication factor" changed
to 5 for one topic and 6 for the other!
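
The affected partitions can be picked out of the --describe output with a
small script. The sketch below is only an illustration: the sample text is
made up (hypothetical topic name and broker IDs, not output from my
cluster), and the regular expression assumes the usual per-partition line
layout of "Topic", "Partition", "Leader", "Replicas" and "Isr" fields.

```python
import re

# Matches per-partition lines; the topic summary line ("PartitionCount:")
# does not match because it has no "Partition:" field.
LINE_RE = re.compile(
    r"Topic:\s*(\S+)\s+Partition:\s*(\d+)\s+Leader:\s*(-?\d+)"
    r"\s+Replicas:\s*([\d,]*)\s+Isr:\s*([\d,]*)"
)


def over_replicated(describe_output, expected_rf=3):
    """Return (topic, partition, replicas) for partitions with too many replicas."""
    found = []
    for line in describe_output.splitlines():
        match = LINE_RE.search(line)
        if not match:
            continue
        replicas = [int(x) for x in match.group(4).split(",") if x]
        if len(replicas) > expected_rf:
            found.append((match.group(1), int(match.group(2)), replicas))
    return found


# Made-up sample in the assumed layout, for illustration only.
SAMPLE = (
    "Topic:topic-a\tPartitionCount:2\tReplicationFactor:5\tConfigs:\n"
    "\tTopic: topic-a\tPartition: 0\tLeader: 1\tReplicas: 1,2,3\tIsr: 1,2,3\n"
    "\tTopic: topic-a\tPartition: 1\tLeader: 4\tReplicas: 4,5,6,1,2\tIsr: 4,5,6,1,2\n"
)

for topic, partition, replicas in over_replicated(SAMPLE):
    print(topic, partition, replicas)
```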

I wonder how it could happen that Kafka increased the replication factor
in this way. Any ideas on how I can get my topics back to replication factor
3 would be appreciated.