Posted to users@kafka.apache.org by Pradeep Jawahar <pj...@groupon.com> on 2015/09/02 20:19:51 UTC
Recovery skipped after unclean shutdown
One of the brokers in our cluster had an unclean shutdown, and after it was
restarted I found the following logs.
$ grep "clean shutdown" /var/groupon/kafka/kafka-broker.log
02/Sep/2015 16:19:23 - warn::[Kafka Server 1], Proceeding to do an
unclean shutdown as all the controlled shutdown attempts failed
02/Sep/2015 16:22:06 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol1'
02/Sep/2015 16:22:11 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol2'
02/Sep/2015 16:22:15 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol3'
02/Sep/2015 16:22:18 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol4'
02/Sep/2015 16:22:22 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol5'
02/Sep/2015 16:22:22 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol6'
02/Sep/2015 16:22:26 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol7'
02/Sep/2015 16:22:29 - info::Found clean shutdown file. Skipping recovery
for all logs in data directory '/data/vol8'
So no recovery happened and the partitions managed by this broker are not
catching up with the other replicas. I found that the ReplicaFetcher
threads for each of the partitions died.
Is anyone aware of how to get out of this situation? I was trying to locate
the shutdown file (maybe it was left over from a previous run) and delete
it.
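
For reference, here is a minimal sketch of how such a leftover marker could be located. It assumes the marker is a file named ".kafka_cleanshutdown" at the top level of each data directory, which is my reading of the 0.8.x source and is not confirmed anywhere in the logs above. The function name is made up for illustration; the directories are the ones from the log lines. It only lists files and deletes nothing.

```shell
#!/bin/sh
# Assumption: the clean shutdown marker is a file called ".kafka_cleanshutdown"
# located directly inside each data directory.
MARKER_NAME=".kafka_cleanshutdown"

# Hypothetical helper: print the marker path for every given data directory
# that contains one. Directories without a marker produce no output.
find_clean_shutdown_markers() {
    for dir in "$@"; do
        if [ -f "$dir/$MARKER_NAME" ]; then
            printf '%s\n' "$dir/$MARKER_NAME"
        fi
    done
}

# Data directories as they appear in the broker log above.
find_clean_shutdown_markers \
    /data/vol1 /data/vol2 /data/vol3 /data/vol4 \
    /data/vol5 /data/vol6 /data/vol7 /data/vol8
```

If a marker shows up with a modification time older than the last shutdown, that would be consistent with it being left over from a previous run. Any deletion would presumably have to happen while the broker is stopped.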
Additional Information
~~~~~~~~~~~~~~~~~
Kafka v 0.8.1.1
CentOS 5
11 node cluster with replication factor 3
Disks are JBODs
Thanks,
Pradeep