Posted to users@kafka.apache.org by "hsy541@gmail.com" <hs...@gmail.com> on 2013/10/11 08:08:02 UTC

Question about auto-rebalancing

Hi guys,

Here is a case I observed. I have a single-node cluster with 3 broker
instances. I created 1 topic with 2 partitions and 2 replicas per partition.
The initial distribution is topic1/partition0 -> (broker0, broker2) and
topic1/partition1 -> (broker1, broker2), so broker0 is the leader for
partition0 and broker1 is the leader for partition1. I then killed broker0,
and broker2 became the leader for partition0. Then I killed broker2, and
broker1 became the leader for both partition0 and partition1, which is
fine. But when I restarted broker0 and broker2, after they synced with
broker1 they remained just replicas for partition0 and partition1. So my
consumers (simple consumers) really don't know which broker they should
read from. I found a command that forces a rebalance after failover, but
isn't there an automatic way to rebalance the leaders?

Best regards,
Siyuan
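
The leadership drift described above can be reproduced with a toy model.
This is only a sketch under stated assumptions, not Kafka code: the names
`assignment`, `leaders`, `kill` and `restart` are hypothetical, failover
simply picks the first surviving replica (real Kafka elects from the in-sync
replica set), and the kill order differs slightly from the message so that
every partition always has a live replica.

```python
# Hypothetical model: each partition has an ordered replica list, and the
# first entry is treated as the "preferred" replica.
assignment = {"topic1/0": [0, 2], "topic1/1": [1, 2]}
alive = {0, 1, 2}
leaders = {p: replicas[0] for p, replicas in assignment.items()}


def kill(broker):
    """Stop a broker and fail its partitions over to another live replica."""
    alive.discard(broker)
    for partition, replicas in assignment.items():
        if leaders[partition] == broker:
            survivors = [r for r in replicas if r in alive]
            leaders[partition] = survivors[0] if survivors else None


def restart(broker):
    """Bring a broker back; it rejoins as a follower and leaders stay put."""
    alive.add(broker)


kill(0)      # partition0 fails over to broker2
kill(1)      # partition1 fails over to broker2
restart(0)
restart(1)
print(leaders)  # {'topic1/0': 2, 'topic1/1': 2}
```

After both restarts every broker is alive again, yet broker2 still leads
both partitions, which is the imbalance the question is about.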

Re: Question about auto-rebalancing

Posted by "hsy541@gmail.com" <hs...@gmail.com>.
Oh, Sriram, thank you very much!




Re: Question about auto-rebalancing

Posted by Sriram Subramanian <sr...@linkedin.com>.
We already have a JIRA for automatic rebalancing. I will be working on it soon.

KAFKA-930 <https://issues.apache.org/jira/browse/KAFKA-930>





Re: Question about auto-rebalancing

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Siyuan,

For automatic leader re-election: yes, we are considering implementing it.
Could you file a JIRA for this issue?

The high-level consumer's rebalancing logic is described at

https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-CanIpredicttheresultsoftheconsumerrebabalance%3F

Guozhang
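
For readers without the FAQ at hand, here is a minimal sketch of a
range-style assignment, which is my understanding of how the 0.8-era
high-level consumer divides partitions within a group. The function name
`range_assign` and the consumer ids are hypothetical, and the real consumer
assigns per topic and per consumer thread, which this sketch ignores. Note
that the split is over partitions, not brokers: each consumer then fetches
from whichever broker currently leads its partitions.

```python
def range_assign(partitions, consumers):
    """Split sorted partitions into contiguous ranges, one per sorted consumer."""
    partitions = sorted(partitions)
    consumers = sorted(consumers)
    per_consumer, extra = divmod(len(partitions), len(consumers))
    result = {}
    for pos, consumer in enumerate(consumers):
        # The first `extra` consumers each take one additional partition.
        start = per_consumer * pos + min(pos, extra)
        count = per_consumer + (1 if pos < extra else 0)
        result[consumer] = partitions[start:start + count]
    return result


print(range_assign([0, 1, 2, 3, 4], ["c2", "c1"]))
# {'c1': [0, 1, 2], 'c2': [3, 4]}
```

Because both lists are sorted first, every consumer in the group can compute
the same result independently.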





-- 
-- Guozhang

Re: Question about auto-rebalancing

Posted by "hsy541@gmail.com" <hs...@gmail.com>.
Hi Jun,

Thanks for your reply. But in a real cluster, one broker can serve several
topics and partitions. The simple consumer only knows which brokers are
available; it has no information for deciding which broker is best to
consume from. If you don't choose carefully, multiple simple consumers
might end up reading from the same node, which is definitely not good for
performance.

Interestingly, I found the command kafka-preferred-replica-election.sh,
which tries to distribute leadership equally among the brokers. That is
good, because I can always have my simple consumer read from the leader
broker (even if it fails, a replica will take over as leader, which is
fine). But why doesn't the Kafka cluster run this command automatically
whenever a broker goes up or down, so that leadership is redistributed
equally among the brokers ASAP? I think that would make it much easier for
a simple consumer to decide which broker to read from.

Another question: I'm also curious how the high-level consumer is balanced.
I assume each high-level consumer knows which brokers the other consumers
(in the same group) read from, and tries to avoid those brokers and pick a
free one? Is there a document describing the balancing rules among
high-level consumers? Is it guaranteed that after several leadership
changes or temporary broker failures, reads are still distributed equally
among the brokers? Basically, I think it would be nice to have an API that
tells developers which consumer reads from which broker; otherwise I don't
know anything about what happens behind the high-level consumer.

Thanks!

Best,
Siyuan
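
What a preferred replica election does to the leader distribution can be
sketched as follows. This is an illustrative model only, not the tool's
implementation: `preferred_replica_election` and `leaders_per_broker` are
hypothetical names, the preferred replica is assumed to be the first broker
in each replica list, and being alive stands in for the real requirement
that the preferred replica be in sync.

```python
def preferred_replica_election(assignment, leaders, alive):
    """Return new leaders, moving each partition back to its preferred replica.

    The preferred replica is taken to be the first broker in the partition's
    replica list. A partition is left alone if that broker is not alive.
    """
    new_leaders = dict(leaders)
    for partition, replicas in assignment.items():
        preferred = replicas[0]
        if preferred in alive:
            new_leaders[partition] = preferred
    return new_leaders


def leaders_per_broker(leaders):
    """Count how many partitions each broker currently leads."""
    counts = {}
    for broker in leaders.values():
        counts[broker] = counts.get(broker, 0) + 1
    return counts


assignment = {"topic1/0": [0, 2], "topic1/1": [1, 2]}
skewed = {"topic1/0": 2, "topic1/1": 2}  # one broker leads everything
balanced = preferred_replica_election(assignment, skewed, alive={0, 1, 2})
print(leaders_per_broker(skewed))    # {2: 2}
print(leaders_per_broker(balanced))  # {0: 1, 1: 1}
```

An automatic version would amount to running this step whenever the
per-broker counts drift too far apart, instead of waiting for an operator.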

Re: Question about auto-rebalancing

Posted by Jun Rao <ju...@gmail.com>.
If you are using the simple consumer, you are responsible for handling
leader changes. When the leader changes, the fetch response returns an
error code, and you need to refresh the metadata and retry the fetch
request against the new leader. For details, see
https://cwiki.apache.org/confluence/display/KAFKA/0.8.0+SimpleConsumer+Example

Thanks,

Jun
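
The refresh-and-retry loop described above looks roughly like this. It is a
sketch against an in-memory stand-in, not the real client: `FakeCluster`,
`fetch_with_retry` and the `NOT_LEADER` marker are hypothetical, and the
linked SimpleConsumer example shows the actual API calls.

```python
NOT_LEADER = "NOT_LEADER_FOR_PARTITION"  # stand-in for the broker's error code


class FakeCluster:
    """In-memory stand-in for a cluster; not a real Kafka client."""

    def __init__(self, leaders):
        self.leaders = dict(leaders)  # partition -> current leader broker id

    def metadata(self, partition):
        """Return the broker currently leading the partition."""
        return self.leaders[partition]

    def fetch(self, broker, partition, offset):
        """Return (error, messages); error is set if broker is not the leader."""
        if broker != self.leaders[partition]:
            return NOT_LEADER, []
        return None, [f"msg-{offset}"]


def fetch_with_retry(cluster, partition, offset, leader, max_retries=3):
    """Fetch from the cached leader, refreshing metadata when leadership moved."""
    for _ in range(max_retries):
        error, messages = cluster.fetch(leader, partition, offset)
        if error is None:
            return leader, messages
        leader = cluster.metadata(partition)  # refresh, then retry
    raise RuntimeError("no leader found after retries")


cluster = FakeCluster({"topic1/0": 0})
cluster.leaders["topic1/0"] = 2  # leadership moves after a failover
print(fetch_with_retry(cluster, "topic1/0", 5, leader=0))  # (2, ['msg-5'])
```

The caller should keep the returned leader id for the next fetch so it does
not pay for a failed request every time.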

