Posted to users@kafka.apache.org by Puneet Lakhina <pu...@gmail.com> on 2018/01/19 20:30:03 UTC

Kafka Streams and Scaling Brokers

Hi,

We have a small Kafka cluster (3 broker nodes) and as our Kafka usage has
grown we are looking to add more brokers. In order for the new brokers to
take on the load of some of the existing topics, I assume we have to either
add partitions that are assigned to these new broker nodes or we have to
move some existing partitions to the new nodes.

We are using Kafka Streams for some aggregate operations (groupByKey
followed by aggregate) and I wanted to understand how to reason about the
impact that adding/moving partitions would have on the streams application.
Would it be advisable to stop the streams applications, add/move the
nodes/partitions and then resume? Would the preferred method be to move
partitions rather than add them, in order to maintain the accuracy of the
aggregate operations of Kafka Streams?

Thanks!

-- 
Regards,
Puneet

Re: Kafka Streams and Scaling Brokers

Posted by Matt Farmer <ma...@frmr.me>.
We recently scaled up the number of brokers we had in our cluster. Instead of adding partitions, we just reassigned the existing partitions to distribute them better across all the brokers we now had. We did this for internal streams topics, too, and things went pretty smoothly.

You can find documentation on how to do that here: https://kafka.apache.org/documentation/#basic_ops_automigrate
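
For reference, the reassignment tool described there reads a JSON plan that lists the desired replica set for each partition. Below is a minimal sketch of generating such a plan. The topic name, partition count, broker IDs and the round-robin placement are all assumptions for illustration, not details from our cluster; the helper build_reassignment is hypothetical and only produces the file, it does not talk to Kafka.

```python
import json


def build_reassignment(topic, num_partitions, broker_ids, replication_factor):
    """Build a round-robin replica plan in the JSON layout that
    kafka-reassign-partitions.sh reads via --reassignment-json-file."""
    if replication_factor > len(broker_ids):
        raise ValueError("replication factor exceeds the number of brokers")
    partitions = []
    for p in range(num_partitions):
        # Start each partition on a different broker and wrap around, so
        # replicas (and the first-listed, preferred leader) are spread out.
        replicas = [
            broker_ids[(p + r) % len(broker_ids)]
            for r in range(replication_factor)
        ]
        partitions.append({"topic": topic, "partition": p, "replicas": replicas})
    return {"version": 1, "partitions": partitions}


# Hypothetical values: a 6-partition topic spread over brokers 1..5.
plan = build_reassignment("example-topic", 6, [1, 2, 3, 4, 5], 3)
print(json.dumps(plan, indent=2))
```

The printed JSON would be saved to a file and handed to the tool with --reassignment-json-file; the linked documentation covers the exact invocation for your Kafka version. Internal streams topics (repartition and changelog topics) would need their own entries in the plan.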

You may find this a more beneficial starting point rather than just adding partitions. But! YMMV!
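
One reason moving is the gentler option for keyed aggregations: the default producer partitioner picks a partition from a hash of the key modulo the partition count. Changing the count can therefore route an existing key to a different partition than before, whereas moving a partition to another broker leaves the count (and the key-to-partition mapping) unchanged. A rough illustration, using CRC32 as a stand-in hash (Kafka's default partitioner uses murmur2) and made-up keys and partition counts:

```python
import zlib


def partition_for(key: str, num_partitions: int) -> int:
    # Stand-in for the default partitioner: hash(key) mod partition count.
    # CRC32 is used here only for illustration; it is not Kafka's hash.
    return zlib.crc32(key.encode("utf-8")) % num_partitions


# Hypothetical keys; compare placement with 3 partitions versus 6.
keys = [f"user-{i}" for i in range(1000)]
moved = [k for k in keys if partition_for(k, 3) != partition_for(k, 6)]
print(f"{len(moved)} of {len(keys)} keys change partition going from 3 to 6")
```

Any key that changes partition may end up being aggregated separately from its earlier records, which is the accuracy concern raised in the original question.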

Matt
