Posted to users@kafka.apache.org by Raju Bairishetti <ra...@apache.org> on 2016/02/22 04:43:28 UTC

Leader was set to -1 for some topic partitions

Hello,
   We are using Kafka 0.8.2. We are running 5 brokers in the prod
cluster. Each topic has two partitions. We are seeing some issues
with some topic partitions.

For some topic partitions the leader was set to -1. I am not seeing any errors
in the controller or server logs. After a server restart, a leader was assigned
for some of those topic partitions. Does this mean there was data loss for those
topic partitions? According to my application metrics there appears to be no
data loss, but I do not have any server logs to prove it from the Kafka side.

*kafka-topics --zookeeper localhost:2181 --describe --topic click_json*

Topic: click_json PartitionCount:2 ReplicationFactor:3 Configs:retention.bytes=42949672960
Topic: click_json Partition: 0 Leader: 4 Replicas: 4,5,1 Isr: 4,1,5
Topic: click_json Partition: 1 Leader: -1 Replicas: 5,1,2 Isr:
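
As an aside, to list only the partitions that currently have no leader, the same
kafka-topics tool should accept an --unavailable-partitions flag (assuming this
version supports it, which I believe it does):

  kafka-topics --zookeeper localhost:2181 --describe --unavailable-partitions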


*Why was the leader set to -1?*

*What is the impact when the leader is set to -1?*

*How do we recover from this error? Which option would be better: restarting the
broker, or choosing a leader by running the preferred replica election script?*
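
For reference, the preferred replica election tool would be invoked roughly like
this (the exact script name can differ per distribution, and the JSON file path
here is just an example; without --path-to-json-file it runs for all partitions):

  kafka-preferred-replica-election --zookeeper localhost:2181 \
      --path-to-json-file /tmp/click_json_p1.json

where /tmp/click_json_p1.json would contain:

  {"partitions": [{"topic": "click_json", "partition": 1}]}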


FYI, we have set unclean.leader.election.enable to false on 3 brokers and to
true on the other 2 brokers.
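
For clarity, this is the broker-side property in question; as far as I understand,
the leader election decision is made by whichever broker currently acts as the
controller, so with mixed values the effective behaviour depends on which broker
holds the controller role at the time:

  # server.properties (sketch; currently false on 3 brokers, true on the other 2)
  unclean.leader.election.enable=false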


Thanks in advance!!!

------
Thanks,
Raju Bairishetti,
www.lazada.com

Re: Leader was set to -1 for some topic partitions

Posted by Raju Bairishetti <ra...@gmail.com>.
After enabling unclean leader election, all topic partitions now have leaders.
Enabling unclean leader election may cause slight data loss, but it saves us
from having lots of offline partitions.

My assumptions about what caused the issue: earlier, we had enabled controlled
shutdown and disabled unclean leader election. It seems that when we shut down a
broker (say *broker1*), leadership was transferred to another broker (*broker2*)
because controlled shutdown was enabled, and data kept flowing to the new leader
(*broker2*) of that topic partition. After a restart of the whole cluster, the
controller tried to elect the preferred replica (*broker1*) as leader, but that
replica was far behind the actual pre-restart leader (*broker2*) and no longer in
the ISR. Since we had disabled unclean leader election, the controller could not
elect that out-of-sync replica as leader, so the partition stayed offline.

After enabling unclean leader election, all partitions came back online.

*It seems that enabling controlled shutdown and disabling unclean leader
election together is what caused the multiple offline partitions in the cluster.*
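
For anyone hitting the same combination, the change boils down to flipping one
broker-side flag and rolling-restarting the brokers (this version has no dynamic
broker configs as far as I know). A rough sketch of the relevant
server.properties lines, assuming defaults elsewhere:

  # server.properties on every broker
  controlled.shutdown.enable=true
  unclean.leader.election.enable=true

Afterwards, the kafka-topics --describe --unavailable-partitions check mentioned
earlier in the thread should come back empty.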



-- 
Thanks,
Raju Bairishetti,

www.lazada.com

Re: Leader was set to -1 for some topic partitions

Posted by Raju Bairishetti <ra...@gmail.com>.
Any thoughts on this?

-- 
Thanks,
Raju Bairishetti,

www.lazada.com

Re: Leader was set to -1 for some topic partitions

Posted by Raju Bairishetti <ra...@gmail.com>.
On Mon, Feb 22, 2016 at 1:13 PM, Salman Ahmed <ah...@gmail.com>
wrote:

> We saw a similar issue a while back. If the leader is -1, I believe ingestion
> won't work for that partition. Was there any data ingestion dip?
>


*No, I am not seeing any data dip for the topic, but there is no data for the
partition whose leader was set to -1.*

-- 
Thanks,
Raju Bairishetti,

www.lazada.com

Re: Leader was set to -1 for some topic partitions

Posted by Salman Ahmed <ah...@gmail.com>.
We saw a similar issue a while back. If the leader is -1, I believe ingestion
won't work for that partition. Was there any data ingestion dip?