Posted to users@kafka.apache.org by Dennis Haller <dh...@talemetry.com> on 2013/08/22 17:53:46 UTC

How does Kafka decide which Consumer out of multiple Consumer clients to assign to a single topic partition

I have a situation where two high level consumers are being created to
consume a single topic. There is only one partition for the topic, so I
understand that only one Consumer will end up owning the topic and
receiving messages. The two consumers are created from two servers in a
redundant master - slave configuration, and it is our intention that the
servers should start in the same configuration predictably, with all the
Consumers active on the master server.

However, after both Consumers have been created, we find that sometimes
the first Consumer client ends up owning the topic and sometimes the
second does. I first thought that the first Consumer client to register
with the topic would be retained even if subsequent Consumers also
registered for it, but I see that the second Consumer client sometimes
succeeds in replacing the first.

The logs show a rebalancing algorithm working after each Consumer is
registered.

In this case, where there is only one topic-partition, is it possible to
predict which Consumer client will own the topic? How is that rebalancing
done?

Thanks
Dennis

Re: How does Kafka decide which Consumer out of multiple Consumer clients to assign to a single topic partition

Posted by Dennis Haller <dh...@talemetry.com>.
The sorted list makes sense with what I'm seeing.

The consumers are named with the group name followed by the server
hostname plus some other string, such as this:
redis-indexer_ip-10-122-123-214.ec2.internal-1377184713770-a6aa2f8e-0

Because in AWS the hostname comes up with a different internal IP
address each time, the consumer instances will sort in a different
order with respect to the machine names from one deployment to another.
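
To make the effect concrete, here is a rough Python sketch (not Kafka
code). It assumes there is a single partition and that consumer ids are
compared as plain strings, so the id that sorts first ends up as the
owner. The function name and every id except the one quoted above are
made up for illustration.

```python
def owner_of_single_partition(consumer_ids):
    """Return the consumer id that sorts first.

    Assumes a single partition and a plain lexicographic sort of ids.
    """
    return sorted(consumer_ids)[0]


# Deployment 1: the master id is the one quoted above; the standby id is made up.
master_1 = "redis-indexer_ip-10-122-123-214.ec2.internal-1377184713770-a6aa2f8e-0"
standby_1 = "redis-indexer_ip-10-98-7-15.ec2.internal-1377184720000-0b1c2d3e-0"

# Deployment 2: both ids are made up; only the internal IPs changed.
master_2 = "redis-indexer_ip-10-200-1-9.ec2.internal-1377271113770-1f2e3d4c-0"
standby_2 = "redis-indexer_ip-10-122-50-3.ec2.internal-1377271120000-5a6b7c8d-0"

print(owner_of_single_partition([master_1, standby_1]) == master_1)  # True
print(owner_of_single_partition([master_2, standby_2]) == master_2)  # False
```

In the first deployment "ip-10-1..." sorts before "ip-10-9...", so the
master wins; in the second "ip-10-1..." belongs to the standby, so the
standby wins even though nothing else changed.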

Dennis



On Thu, Aug 22, 2013 at 9:31 AM, Guozhang Wang <wa...@gmail.com> wrote:

> Hello Dennis,
>
> The rebalance on each consumer works by first releasing its owned
> partitions (releasePartitionOwnership in
> ZookeeperConsumerConnector.scala) and then computing the new ownership.
> Hence in your scenario it is equally possible for either of the two
> consumers to own the partition after each rebalance, and it is not
> possible to predict which consumer will claim ownership of the partition.
>
> Guozhang

Re: How does Kafka decide which Consumer out of multiple Consumer clients to assign to a single topic partition

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Dennis,

The rebalance on each consumer works by first releasing its owned
partitions (releasePartitionOwnership in
ZookeeperConsumerConnector.scala) and then computing the new ownership.
Hence in your scenario it is equally possible for either of the two
consumers to own the partition after each rebalance, and it is not
possible to predict which consumer will claim ownership of the partition.

Guozhang



Re: How does Kafka decide which Consumer out of multiple Consumer clients to assign to a single topic partition

Posted by Dennis Haller <dh...@talemetry.com>.
Thanks Guozhang. It's good to have this in the FAQ.



On Thu, Aug 22, 2013 at 10:22 AM, Guozhang Wang <wa...@gmail.com> wrote:

> Thanks Neha for the clarification. I have created a new entry in FAQ for
> this question:
>
>
> https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-CanIpredicttheresultsoftheconsumerrebabalance%3F
>
> Dennis, please let me know if this does not fully answer your question.
>
> Guozhang

Re: How does Kafka decide which Consumer out of multiple Consumer clients to assign to a single topic partition

Posted by Guozhang Wang <wa...@gmail.com>.
Thanks, Neha, for the clarification. I have created a new entry in the
FAQ for this question:

https://cwiki.apache.org/confluence/display/KAFKA/FAQ#FAQ-CanIpredicttheresultsoftheconsumerrebabalance%3F

Dennis, please let me know if this does not fully answer your question.

Guozhang


On Thu, Aug 22, 2013 at 9:29 AM, Neha Narkhede <ne...@gmail.com>wrote:

> We range partition a sorted list of topic-partitions over a sorted list of
> consumer instances. This makes the rebalancing algorithm deterministic. As
> soon as you bring up the 2nd consumer instance, if its position in the
> sorted list is before the position of the 1st consumer client, it will end
> up owning the partition.
>
> Thanks,
> Neha

Re: How does Kafka decide which Consumer out of multiple Consumer clients to assign to a single topic partition

Posted by Neha Narkhede <ne...@gmail.com>.
We range partition a sorted list of topic-partitions over a sorted list of
consumer instances. This makes the rebalancing algorithm deterministic. As
soon as you bring up the 2nd consumer instance, if its position in the
sorted list is before the position of the 1st consumer client, it will end
up owning the partition.
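
As an illustration only, the range partitioning described above can be
sketched in Python roughly as follows. This is not the Kafka source; the
function name, the consumer ids, and the rule that the first consumers in
sorted order take any leftover partitions are assumptions made for the
sketch.

```python
def range_assign(partitions, consumers):
    """Range-partition a sorted list of partitions over sorted consumer ids."""
    parts = sorted(partitions)
    members = sorted(consumers)
    # Every consumer gets per_consumer partitions; the first `extra`
    # consumers in sorted order get one more each (assumed tie-break).
    per_consumer, extra = divmod(len(parts), len(members))
    assignment = {}
    for position, consumer in enumerate(members):
        start = per_consumer * position + min(position, extra)
        count = per_consumer + (1 if position < extra else 0)
        assignment[consumer] = parts[start:start + count]
    return assignment


# One partition, two consumers: the id that sorts first owns it,
# regardless of the order in which the consumers registered.
print(range_assign([0], ["consumer-b", "consumer-a"]))
# {'consumer-a': [0], 'consumer-b': []}
```

With a single partition the result depends only on which consumer id
sorts first, which is why the outcome is deterministic for a given pair
of ids but can flip when an id changes.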

Thanks,
Neha

