Posted to users@kafka.apache.org by Kevin Scaldeferri <ke...@scaldeferri.com> on 2015/10/28 18:36:15 UTC

Consumer with more stable partition assignments?

One of the big issues we run into with the 0.8 high-level consumer is the
instability of partition assignments during a rebalance.  The simple
lexicographic assignment strategy means that if one consumer instance dies
(and potentially a new instance with a different consumerId gets spun up to
replace it), every other consumer instance potentially changes its
partition assignments.  This is bad for data locality, cache-hit-rates, etc.
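
To make the problem concrete, here is a minimal sketch in Python of a
range-style assignment over lexicographically sorted consumer IDs. It is a
simplified model for illustration, not the actual 0.8 consumer code; the
function names and the consumer IDs ("c1", "c9", ...) are hypothetical.

```python
def range_assign(partitions, consumers):
    """Give each consumer (in sorted ID order) a contiguous range of partitions."""
    consumers = sorted(consumers)
    partitions = sorted(partitions)
    per, extra = divmod(len(partitions), len(consumers))
    assignment = {}
    start = 0
    for i, consumer in enumerate(consumers):
        # The first `extra` consumers take one additional partition.
        count = per + (1 if i < extra else 0)
        for p in partitions[start:start + count]:
            assignment[p] = consumer
        start += count
    return assignment


def moved(before, after, survivors):
    """Partitions a surviving consumer owned before but no longer owns after."""
    return sorted(p for p, c in before.items()
                  if c in survivors and after[p] != c)


partitions = list(range(8))
before = range_assign(partitions, ["c1", "c2", "c3", "c4"])
# "c1" dies and is replaced by an instance whose ID happens to sort last.
after = range_assign(partitions, ["c2", "c3", "c4", "c9"])
moved_partitions = moved(before, after, {"c2", "c3", "c4"})
print(moved_partitions)
```

In this model, every partition held by a surviving consumer changes owner,
even though only one instance was replaced.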

We'd really like it if under a rebalance consumers kept the same partition
assignments as much as possible.  So, if one instance goes away, another
will have to take over that partition, but there's no reason that large
numbers of consumers in the group should have to change which partitions they
consume.  Similarly, if a new instance is added, other consumers might
relinquish excess partitions to maintain good balance of work-load across
the group, but it would be nice if we didn't shuffle everything around.
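
The behavior described above could be sketched roughly as follows, in Python.
This is one possible greedy approach under stated assumptions, not the
algorithm referred to in this message: each surviving consumer keeps its
previous partitions up to a cap of ceil(partitions / consumers), and only
orphaned or excess partitions are handed to the least-loaded consumers. The
function name, the `previous` mapping, and the consumer IDs are hypothetical.
Note that the sketch only caps the maximum load; it does not force consumers
to give up partitions to reach a minimum load.

```python
import math


def sticky_assign(partitions, consumers, previous):
    """Keep previous owners where possible; reassign only what must move."""
    consumers = sorted(consumers)
    cap = math.ceil(len(partitions) / len(consumers))
    owned = {c: [] for c in consumers}
    orphans = []
    for p in sorted(partitions):
        owner = previous.get(p)
        if owner in owned and len(owned[owner]) < cap:
            owned[owner].append(p)   # owner is still alive and under the cap
        else:
            orphans.append(p)        # owner is gone, or already holds `cap`
    for p in orphans:
        # Hand each orphan to the least-loaded consumer (ties broken by ID).
        target = min(consumers, key=lambda c: (len(owned[c]), c))
        owned[target].append(p)
    return {p: c for c, ps in owned.items() for p in ps}


previous = {0: "c1", 1: "c1", 2: "c2", 3: "c2",
            4: "c3", 5: "c3", 6: "c4", 7: "c4"}
# "c1" is replaced by "c9"; the other three consumers survive.
current = sticky_assign(range(8), ["c2", "c3", "c4", "c9"], previous)
unchanged = sorted(p for p in previous if current[p] == previous[p])
print(unchanged)
```

Here only the two partitions of the dead instance move (to its replacement),
and the six partitions held by survivors stay where they were.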

I've been working on an algorithm and implementation to support this, but
before getting too far down the path, I wanted to check and see if anyone
has previously developed a consumer that works like this.  So far I haven't
had any luck searching for one.

Also, in reading the 0.9 consumer docs, there are some hints that
rebalancing / partition assignment will work differently from the current
assignment strategy (and, in particular, that it might be possible to plug
in user-defined strategies).  I'd love to know a little more about this,
and if it might be simpler to achieve the behavior we're looking for with
0.9.

Thanks,
Kevin

Re: Consumer with more stable partition assignments?

Posted by Joey Echeverria <jo...@rocana.com>.
I don't have a solution, but I thought I'd chime in with interest in finding a solution to this problem. We have a use case where we're partitioning the dataset we write to according to Kafka partitions, and having to close all writers and re-open them after a rebalance is a pain point.

-Joey
