Posted to users@kafka.apache.org by Joel Ohman <ma...@gmail.com> on 2015/06/24 04:34:43 UTC

Consumer rebalancing based on partition sizes?

Hello!

I'm working with a topic whose partition sizes vary widely. My biggest
concern is that I have no control over which keys are assigned to which
consumers in my consumer group, and the amount of data a consumer sees
directly determines its workload. Is there a way to distribute
partitions to consumers evenly based on the size of each partition? The
provided Consumer Rebalancing Algorithm prioritizes giving each consumer
an equal number of partitions, regardless of their size.

Regards,
Joel

Re: Consumer rebalancing based on partition sizes?

Posted by Ewen Cheslack-Postava <ew...@confluent.io>.
Partition assignment currently has only a few limited options; see the
partition.assignment.strategy consumer option (which seems to be listed
twice in the documentation; see the second entry for a more detailed
explanation). There has been some discussion of making assignment
strategies user-extensible to support use cases like this.
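
To make the idea of a size-aware strategy concrete, here is a minimal
sketch of one possible approach: a greedy "largest partition first"
heuristic that always hands the next partition to the least-loaded
consumer. This is not an existing Kafka strategy and uses no Kafka API;
the function name assign_by_size, the consumer ids, and the partition
sizes are all hypothetical, and it assumes partition sizes are already
known from some external measurement.

```python
import heapq


def assign_by_size(partition_sizes, consumers):
    """Greedy sketch: give the largest remaining partition to the
    consumer with the smallest total load so far.

    partition_sizes: dict of partition id -> size (hypothetical units)
    consumers: iterable of consumer ids (hypothetical names)
    """
    consumers = sorted(consumers)
    if not consumers:
        raise ValueError("need at least one consumer")

    # Min-heap of (current load, consumer id); ids break ties deterministically.
    heap = [(0, c) for c in consumers]
    heapq.heapify(heap)
    assignment = {c: [] for c in consumers}

    # Visit partitions from largest to smallest (ties broken by partition id).
    ordered = sorted(partition_sizes.items(), key=lambda kv: (-kv[1], kv[0]))
    for partition, size in ordered:
        load, consumer = heapq.heappop(heap)
        assignment[consumer].append(partition)
        heapq.heappush(heap, (load + size, consumer))
    return assignment


# Made-up sizes for a six-partition topic and two consumers.
sizes = {0: 900, 1: 100, 2: 400, 3: 350, 4: 50, 5: 200}
result = assign_by_size(sizes, ["consumer-a", "consumer-b"])
loads = {c: sum(sizes[p] for p in ps) for c, ps in result.items()}
print(result)
print(loads)
```

With these made-up sizes, consumer-a gets two partitions and consumer-b
gets four, but each ends up with 1000 units of data. A count-based
strategy would instead give each consumer three partitions with unequal
totals. The heuristic is only an approximation of the optimal split, and
a real strategy would also have to cope with sizes changing over time.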

Is there a reason your data is unbalanced that might be avoidable? Ideally,
good hashing of keys, combined with a large enough number of keys and a
reasonable (not necessarily uniform) distribution of data across keys,
leads to a reasonable balance, although there are certainly some workloads
that are so skewed that this doesn't work out.
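
One way to check whether the imbalance comes from the keys themselves is
to simulate how per-key data volumes land on partitions. The sketch below
is only an illustration: it uses zlib.crc32 as a stand-in hash (an
assumption, not the hash a Kafka partitioner actually uses), and the
helper names partition_loads and skew_ratio, the key names, and the byte
counts are all made up.

```python
import zlib


def partition_loads(key_bytes, num_partitions):
    """Sum bytes per partition, mapping each key with a stand-in hash."""
    loads = {p: 0 for p in range(num_partitions)}
    for key, nbytes in key_bytes.items():
        # crc32 is an assumption for illustration, not Kafka's partitioner.
        partition = zlib.crc32(key.encode("utf-8")) % num_partitions
        loads[partition] += nbytes
    return loads


def skew_ratio(loads):
    """Largest partition divided by the mean partition size (1.0 = balanced)."""
    mean = sum(loads.values()) / len(loads)
    return max(loads.values()) / mean if mean else 0.0


# Many keys with equal volume: hashing should spread these fairly well.
uniform = {"key-%d" % i: 100 for i in range(1000)}
uniform_loads = partition_loads(uniform, 8)

# Same keys plus one very hot key: no hash can split a single key.
hot = dict(uniform)
hot["hot-key"] = 1000000
hot_loads = partition_loads(hot, 8)

print("uniform skew:", skew_ratio(uniform_loads))
print("hot-key skew:", skew_ratio(hot_loads))
```

If the ratio stays high even with a good hash, the skew is in the data
(a few dominant keys) rather than in the hashing, and changing the key
scheme or the assignment strategy is the remaining option.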






-- 
Thanks,
Ewen