Posted to users@kafka.apache.org by andrea sposito <an...@gmail.com> on 2019/06/19 08:53:21 UTC

redistribute data partition

I'm relatively new to Kafka and have probably misunderstood how
rebalancing works. In my case, I created a topic with 4 partitions and
then published some data.

To test our product, we started consuming the data with 4 clients in the
same consumer group.

To increase consumer throughput, we extended the topic to 8 partitions.
After restarting the clients, we noticed that only 4 clients continued to
consume, whereas I expected all 8 clients to start consuming.

The official Kafka documentation says "partitioning will potentially be
shuffled by adding partitions but Kafka will not attempt to automatically
redistribute data in any way". Is there a way to redistribute the data
(already present in Kafka) across all partitions of the topic?

Thanks for your help
Andrea

Re: redistribute data partition

Posted by "Matthias J. Sax" <ma...@confluent.io>.
You first mention that you started 4 consumers, and later that only 4 out
of 8 consumers read data. This is a little confusing.

About adding partitions and the quote from the docs: The quote is
about broker behavior. When you create a topic, for each topic partition
replica, a broker is selected to store the data. This mapping does not
change on its own (although it's possible to manually reassign partition
replicas). If you add new partitions, again for each new partition
replica a broker will be selected to store the data. Existing partition
replicas are not re-mapped to other brokers.
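
To make the "will not redistribute data" part concrete, here is a minimal
Python sketch of a topic as a list of append-only logs. It is a toy model,
not Kafka code: the function names are made up, and crc32 is only a
stand-in for the hash used by the real default partitioner (murmur2 in the
Java client). It shows that records written before the expansion stay
where they are, while a key may map to a different partition afterwards.

```python
import zlib

def partition_for(key: str, num_partitions: int) -> int:
    # Assumed model of a keyed partitioner: hash(key) mod partition count.
    # crc32 is an illustrative stand-in, not the hash Kafka actually uses.
    return zlib.crc32(key.encode("utf-8")) % num_partitions

def produce(topic: list, key: str, value: str) -> int:
    # Append the record to the log chosen by the partitioner.
    p = partition_for(key, len(topic))
    topic[p].append((key, value))
    return p

# Topic created with 4 partitions, then some data is published.
topic = [[] for _ in range(4)]
keys = [f"user-{i}" for i in range(20)]
before = {k: produce(topic, k, "old") for k in keys}

# Expand to 8 partitions: empty logs are added, existing logs are untouched.
snapshot = [list(log) for log in topic]
topic.extend([] for _ in range(4))

# New records for the same keys may now land in different partitions.
after = {k: produce(topic, k, "new") for k in keys}
moved = sorted(k for k in keys if before[k] != after[k])

print("keys whose partition changed:", moved)
print("old records found in partitions 4-7:",
      sum(1 for log in topic[4:] for _, v in log if v == "old"))
```

In this model the only way to get old records into the new partitions is
to read them and publish them again, for example into a new topic.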

Pinning partitions to brokers has nothing to do with rebalancing, though.
Rebalancing is a consumer concept: during a rebalance, it is decided
which consumer in the group reads which partitions.
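
As a rough illustration of what a rebalance decides, the following Python
sketch splits the partitions of one topic among the group members in
contiguous ranges. It is a simplified model in the spirit of the
range-style assignment strategy, not the client's actual implementation,
and the consumer names are hypothetical. Since a partition is read by at
most one consumer in a group, members beyond the partition count end up
with nothing to read.

```python
def assign(consumers, num_partitions):
    """Split partitions 0..num_partitions-1 among consumers in contiguous ranges."""
    members = sorted(consumers)
    base, extra = divmod(num_partitions, len(members))
    assignment, start = {}, 0
    for i, member in enumerate(members):
        # The first `extra` members get one partition more than the rest.
        count = base + (1 if i < extra else 0)
        assignment[member] = list(range(start, start + count))
        start += count
    return assignment

four = ["c1", "c2", "c3", "c4"]
eight = four + ["c5", "c6", "c7", "c8"]

print(assign(four, 4))   # one partition per consumer
print(assign(four, 8))   # two partitions per consumer
print(assign(eight, 4))  # four consumers stay idle
print(assign(eight, 8))  # one partition per consumer
```

So 8 consumers can all be busy only once the group has actually
rebalanced against the 8-partition view of the topic.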

If you add new partitions to a topic, it may take some time until clients
learn about the new partitions. When they learn about new partitions,
they will rebalance. Did you observe whether a rebalance happened? By
default, clients refresh their metadata about existing topics and
partitions every 5 minutes (metadata.max.age.ms) -- maybe you just need
to wait longer.
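
If waiting is not an option during a test, the refresh interval can be
lowered on the client. The sketch below only builds the settings as plain
key/value pairs and prints them in properties-file form. The property
names are the ones used by the Java client; the broker address, the group
name and the 30 second value are assumptions chosen for illustration.

```python
# Consumer settings as plain key/value pairs.
consumer_config = {
    "bootstrap.servers": "localhost:9092",  # hypothetical broker address
    "group.id": "my-test-group",            # hypothetical group name
    # The default is 300000 ms (5 minutes). A lower value lets clients notice
    # newly added partitions sooner, at the cost of more metadata requests.
    "metadata.max.age.ms": 30000,
}

def to_properties(config):
    # Render the settings in the key=value form used by .properties files.
    return "\n".join(f"{key}={value}" for key, value in config.items())

print(to_properties(consumer_config))
```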


-Matthias
