Posted to users@kafka.apache.org by Wes Chow <we...@chartbeat.com> on 2014/11/18 17:48:35 UTC

partition auto-rebalance

I'm trying to understand the config options for auto-rebalancing. This 
is what we have in /etc/kafka/server.properties for all the nodes:

auto.leader.rebalance.enable=true
leader.imbalance.per.broker.percentage=10
leader.imbalance.check.interval.seconds=300

We have 10 nodes for this topic which has 512 partitions. They were 
evenly balanced before I started my experiment. I shut down two of the 
nodes, and the number of leaders per node is now:

      75 10
      68 3
      57 4
      67 5
      57 6
      68 7
      63 8
      57 9

Where the first column is # of leaders, and the second column is node #. 
You can see that nodes 1 and 2 have no leaders, since they're down. It's 
been about half an hour since I did this and the balancing hasn't changed.

The documentation on the config option is very ambiguous. My 
interpretation is that auto-rebalance kicks in if any particular node 
has 10% more leaders. If that means 10% more than the average, then 
node #10 leads 75 partitions against an average of 64, so that's a 
17% difference.
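
The 17% figure can be checked with a short script. This is only a sketch of the "percent above the average" reading described above, using the leader counts from the listing; the variable names are made up for illustration, and the replies in this thread explain that the broker measures something different.

```python
# Leader counts per node, copied from the listing above (node id -> leaders).
leaders_per_node = {10: 75, 3: 68, 4: 57, 5: 67, 6: 57, 7: 68, 8: 63, 9: 57}

total = sum(leaders_per_node.values())      # all partitions of the topic
average = total / len(leaders_per_node)     # mean leaders per live node

# Percentage by which each node is above (or below) the average.
excess_pct = {
    node: (count - average) / average * 100
    for node, count in leaders_per_node.items()
}

# Nodes that would exceed a 10% threshold under this reading.
over_threshold = sorted(n for n, pct in excess_pct.items() if pct > 10)

print(total, average, round(excess_pct[10], 1), over_threshold)
```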

So I think I'm misunderstanding either what auto-rebalance is supposed 
to do or the condition that's supposed to trigger it. Any clues?

Thanks,
Wes


Re: partition auto-rebalance

Posted by Joel Koshy <jj...@gmail.com>.
The imbalance is measured with respect to preferred leaders: for every
partition, the first replica in the assigned replica list (as shown in
the output of kafka-topics.sh) is called the preferred replica. On
each broker, the auto-balancer counts the number of partitions led by
that broker for which the preferred leader is on another broker. If the
imbalance exceeds the threshold, it does a preferred replica leader
election.
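
As a rough illustration of the counting described above, here is a minimal sketch. It is not Kafka's controller code: the function name and the toy assignment are hypothetical, and it only shows the idea of treating the first replica in each list as the preferred leader.

```python
from collections import Counter

def non_preferred_leads(assignments, leaders):
    """Count, per broker, the partitions it leads whose preferred
    replica (first entry in the replica list) is a different broker."""
    counts = Counter()
    for partition, replicas in assignments.items():
        preferred = replicas[0]
        leader = leaders[partition]
        if leader != preferred:
            counts[leader] += 1
    return counts

# Toy data (hypothetical): partition -> replica list, partition -> leader.
assignments = {0: [1, 3, 4], 1: [2, 4, 5], 2: [3, 5, 6], 3: [1, 4, 3]}
leaders = {0: 3, 1: 4, 2: 3, 3: 4}

counts = non_preferred_leads(assignments, leaders)
print(dict(counts))
```

In this toy data brokers 1 and 2 lead nothing, so every partition that prefers them shows up as a non-preferred lead on some other broker.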

In your case, can you run the kafka-topics.sh script and see if the
preferred replicas are evenly distributed? Also, which version of
Kafka are you using?

Thanks,

Joel



Re: partition auto-rebalance

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Wes,

The documentation here is indeed a bit misleading:

http://kafka.apache.org/documentation.html#brokerconfigs

In Kafka a partition has a replica list {A,B,C..}, and the first replica
is normally the leader of the partition. When that is not the case, for
example because A is down and B has become the leader, the replica list is
still {A,B,C..}, but A's status is "offline replica" and B is the new leader.
Even after A comes back it remains a follower, and hence this is a case of
"imbalance".

The "leader.imbalance.per.broker.percentage" setting kicks in when the
percentage of such imbalance cases is higher than the threshold. In your case
the imbalance cases will be more than 10%, but since those two brokers are not
back, the rebalance logic, although triggered, will not be able to do
anything (you may check the controller logs for entries like "Starting
preferred replica leader election for ..." to verify).

When you bring those two brokers back online, I think the auto leader
rebalance will execute and move the leaders back to those brokers.
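
The behaviour described above can be modelled with a small sketch. It rests on assumptions and is not the controller's actual code: the function name and toy data are hypothetical, and grouping partitions by their preferred broker and using that group's size as the denominator is one plausible reading of how the percentage is computed.

```python
def plan_preferred_elections(assignments, leaders, live_brokers, threshold_pct=10):
    """Return {preferred broker: partitions whose leadership can move back}.
    A broker over the threshold but not alive gets an empty list, i.e. the
    check fires but nothing can be done."""
    by_preferred = {}
    for partition, replicas in assignments.items():
        by_preferred.setdefault(replicas[0], []).append(partition)

    plan = {}
    for broker, partitions in by_preferred.items():
        misplaced = [p for p in partitions if leaders[p] != broker]
        ratio_pct = 100 * len(misplaced) / len(partitions)
        if ratio_pct > threshold_pct:
            plan[broker] = misplaced if broker in live_brokers else []
    return plan

# Toy data (hypothetical): broker 1 is preferred for partitions 0 and 1.
assignments = {0: [1, 3], 1: [1, 4], 2: [3, 4], 3: [4, 3]}
leaders = {0: 3, 1: 4, 2: 3, 3: 4}

while_down = plan_preferred_elections(assignments, leaders, live_brokers={3, 4})
after_restart = plan_preferred_elections(assignments, leaders, live_brokers={1, 3, 4})
print(while_down, after_restart)
```

While broker 1 is down the threshold is exceeded but the plan is empty; once it is back, the same check yields the partitions whose leadership can return to it.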

Guozhang



