Posted to users@kafka.apache.org by ashendra bansal <as...@gmail.com> on 2014/12/09 08:54:47 UTC

Kafka Issue [Corrupted broker]

Hi,

    One of the brokers in my cluster of 7 brokers seems to have become
corrupted. All the topic partitions for which this broker was the leader
now report NoLeader or UnderReplicated partition exceptions.

All these partitions have no leader and no replica at all in the ISR
(in-sync replica) set.

Corrupt broker id - 5.

topic: topic1 partition: 2 leader: -1 replicas: 5 isr:
topic: topic1 partition: 8 leader: -1 replicas: 5 isr:
topic: topic1 partition: 14 leader: -1 replicas: 5 isr:
topic: topic2 partition: 1 leader: -1 replicas: 5 isr:
topic: topic2 partition: 8 leader: -1 replicas: 5 isr:
topic: topic2 partition: 15 leader: -1 replicas: 5 isr:
topic: topic3 partition: 1 leader: -1 replicas: 5 isr:
topic: topic3 partition: 8 leader: -1 replicas: 5 isr:
topic: topic3 partition: 15 leader: -1 replicas: 5 isr:

I have tried the replication tools to manually assign a broker to these
partitions, but that did not help, as none of the replicas are in the ISR set.

Unfortunately, the replication factor for these topics was 1. But the
problem also persists for topics with a higher replication factor. There,
leadership has moved to the next preferred replica, but the replica on the
corrupt broker has not rejoined the ISR set even after a long time (days),
and the partitions have logs on the order of 100s.

topic: topic4 partition: 1 leader: 6 replicas: 5,6 isr: 6

For the same topic, on the partition where broker 5 (the corrupted broker)
was not the leader, broker 5 is still in the ISR set.

topic: topic4 partition: 0 leader: 4 replicas: 4,5 isr: 4,5
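
A minimal sketch of how listings like the ones above could be checked mechanically, assuming the one-line-per-partition format shown in this message. The names `parse_line` and `classify` are hypothetical helpers for illustration, not part of any Kafka tool, and the "healthy" label only means the ISR is as large as the replica list.

```python
import re

# Matches lines such as:
#   topic: topic4 partition: 1 leader: 6 replicas: 5,6 isr: 6
# The isr field may be empty, as in the leaderless partitions above.
LINE_RE = re.compile(
    r"topic:\s*(?P<topic>\S+)\s+partition:\s*(?P<partition>\d+)\s+"
    r"leader:\s*(?P<leader>-?\d+)\s+replicas:\s*(?P<replicas>[\d,]*)"
    r"\s*isr:\s*(?P<isr>[\d,]*)"
)


def parse_ids(text):
    """Turn '5,6' into [5, 6]; an empty string becomes []."""
    return [int(x) for x in text.split(",") if x]


def parse_line(line):
    """Parse one listing line into a dict, or return None if it does not match."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "topic": m["topic"],
        "partition": int(m["partition"]),
        "leader": int(m["leader"]),
        "replicas": parse_ids(m["replicas"]),
        "isr": parse_ids(m["isr"]),
    }


def classify(p):
    """Label a parsed partition by the symptoms described in this thread."""
    if p["leader"] == -1:
        return "no-leader"
    if len(p["isr"]) < len(p["replicas"]):
        return "under-replicated"
    return "healthy"


if __name__ == "__main__":
    sample = [
        "topic: topic1 partition: 2 leader: -1 replicas: 5 isr:",
        "topic: topic4 partition: 1 leader: 6 replicas: 5,6 isr: 6",
        "topic: topic4 partition: 0 leader: 4 replicas: 4,5 isr: 4,5",
    ]
    for line in sample:
        p = parse_line(line)
        print(p["topic"], p["partition"], classify(p))
```
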

Another observation: the corrupted broker prints a topic creation message in
its INFO logs very frequently, every minute:

[2014-12-09 13:07:27,878] INFO Topic creation { "partitions":{ "0":[ 4, 3
], "1":[ 5, 4 ] }, "version":1 } (kafka.admin.AdminUtils$)

Yet no topics are being created on the cluster.

Has anyone faced a similar problem? How can I fix it?

Ashendra

Re: Kafka Issue [Corrupted broker]

Posted by Joe Stein <jo...@stealth.ly>.
It looks like broker 5 is in a bad state. You are likely going to have to
shut it down. From there you have a few options, and your environment setup
will dictate whether you shut it down and what you do after that. Spinning
up another server with broker.id == 5 and letting replication heal the
topics that were durable is one way to go. If you do that, you can then go
back to the old server, debug what went wrong, and recover the replication
factor == 1 partition data (back it up), fixing that later once you have
figured out what went wrong.
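
To make the split between "heal by replication" and "recover from the old server" concrete, here is a rough sketch that groups partitions by whether they have a replica on any broker other than the bad one. The `triage` function and its input tuples are hypothetical. It looks only at the replica lists, so it is a planning aid under that assumption: whether replication actually heals a partition still depends on a surviving replica being in sync.

```python
def triage(partitions, bad_broker):
    """Split partitions hosted on bad_broker into two groups.

    partitions: iterable of (topic, partition, replicas) tuples.
    Returns (heal, recover):
      heal    - partitions with at least one replica on another broker,
                which a replacement broker could re-fetch
      recover - partitions whose only replica is on bad_broker, so their
                data exists only on the old server's disk
    """
    heal, recover = [], []
    for topic, partition, replicas in partitions:
        if bad_broker not in replicas:
            continue  # not affected by the bad broker
        others = [r for r in replicas if r != bad_broker]
        if others:
            heal.append((topic, partition))
        else:
            recover.append((topic, partition))
    return heal, recover


if __name__ == "__main__":
    # Replica lists taken from the listings quoted in this thread.
    listed = [
        ("topic1", 2, [5]),
        ("topic2", 1, [5]),
        ("topic4", 1, [5, 6]),
        ("topic4", 0, [4, 5]),
    ]
    heal, recover = triage(listed, bad_broker=5)
    print("heal via replication:", heal)
    print("back up from old server:", recover)
```
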

/*******************************************
 Joe Stein
 Founder, Principal Consultant
 Big Data Open Source Security LLC
 http://www.stealth.ly
 Twitter: @allthingshadoop <http://www.twitter.com/allthingshadoop>
********************************************/
