Posted to users@kafka.apache.org by Wes Chow <we...@chartbeat.com> on 2015/04/03 19:06:24 UTC
partition reassignment
I'm in the process of reassigning partitions away from failing machines,
and the reassignment appears to be stuck. My guess is that our machines
are failing at such a high rate that some partitions no longer have any
live replicas at all. At this point I don't care about the data; I just
want to get all partitions onto the set of machines that I know work. Is
there some way I can do this? I am happy to manipulate ZooKeeper and
bounce nodes if need be.
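
For concreteness, here is a minimal sketch of building the kind of
reassignment plan in question: the JSON file that
kafka-reassign-partitions.sh takes via --reassignment-json-file. The
function name, the partition count, and broker ids 30 and 31 are
hypothetical placeholders (29 is the broker id from the log further
down); the round-robin placement is just one simple way to spread
replicas and is not what Kafka's own tooling would generate.

```python
import json


def build_reassignment(topic, num_partitions, healthy_brokers, replication_factor):
    """Assign each partition's replicas round-robin across healthy brokers.

    Returns a dict in the JSON layout that kafka-reassign-partitions.sh
    reads: {"version": 1, "partitions": [{"topic", "partition", "replicas"}]}.
    """
    if replication_factor > len(healthy_brokers):
        raise ValueError("replication factor exceeds number of healthy brokers")

    partitions = []
    for p in range(num_partitions):
        # Start at a different broker for each partition so that the first
        # replica (the preferred leader) is spread over the healthy set.
        replicas = [
            healthy_brokers[(p + i) % len(healthy_brokers)]
            for i in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})

    return {"version": 1, "partitions": partitions}


if __name__ == "__main__":
    # Hypothetical values: 4 partitions, brokers 29/30/31 known to be healthy.
    plan = build_reassignment("pings", 4, [29, 30, 31], 2)
    print(json.dumps(plan, indent=2))
```

The output would be saved to a file and passed to the tool together
with --execute. Whether such a plan can complete when a partition has
no live replica left is exactly the open question here.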
And a warning... this is due to Amazon EC2 d2 instance type failures. We
spun up 9 d2.xlarge instances and within a few hours 6 have failed under
a Kafka workload. So yeah, bleeding edge.
One thing I've done is rebuild one of these nodes with the same broker
id and name, but on a known working instance type. It came up and is now
spewing this in the logs:
[2015-04-03 13:05:30,275] 805497 [kafka-request-handler-0] WARN kafka.server.KafkaApis - [KafkaApi-29] Produce request with correlation id 5849 from client ping_partitioner on partition [pings,245] failed due to Topic pings either doesn't exist or is in the process of being deleted
The topic most certainly should exist; however, I'm guessing the broker
is complaining because there are no live replicas for that partition. Is
there some way to get the rebuilt broker to just become leader?
Wes