Posted to users@kafka.apache.org by Siva Ram <si...@gmail.com> on 2018/07/24 01:19:38 UTC

Kafka Streams throughput performance over time

Hi,

I have a Kafka Streams application that rolls data up from 15-minute to
hourly intervals, and then from hourly to daily.  The process needs to run
continuously, 24 hours a day, and approximately 12 million records (one
JSON record per message) are posted to the input topic every 15 minutes.
There are 3 separate processors corresponding to the above, of which the
hourly and daily processors maintain state.  The hourly processor needs to
retain about 10 million records in its state every hour, and the daily
processor about 10 million overall.  42 GB of memory is allocated to the
whole application.  Throughput is fine for roughly the first 10 hours, and
after that it degrades significantly.  Any suggestions on how to identify
the source of the delay and increase throughput would be of great help.
We are on Kafka 1.0.0.

Thanks
Siva
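
For readers following the thread, the rollup described above can be
sketched in plain Java.  This is not the Kafka Streams API and not the
poster's code: the class and method names (RollupSketch, windowStart,
rollup, requiredRatePerSecond) are hypothetical, the aggregate is assumed
to be a simple sum, record keys are left out, and windows are assumed to
be aligned to the epoch in UTC.  It only illustrates how 15-minute totals
fold into hourly and then daily windows, and what input rate 12 million
records per 15 minutes implies.

```java
import java.util.Map;
import java.util.TreeMap;

// Illustrative only: plain-Java model of a 15-minute -> hourly -> daily rollup.
class RollupSketch {
    static final long QUARTER_HOUR_MS = 15 * 60 * 1000L;
    static final long HOUR_MS = 60 * 60 * 1000L;
    static final long DAY_MS = 24 * HOUR_MS;

    // Start of the tumbling window of the given size that contains the
    // timestamp (epoch milliseconds, windows aligned to the epoch).
    static long windowStart(long timestampMs, long windowSizeMs) {
        return timestampMs - Math.floorMod(timestampMs, windowSizeMs);
    }

    // Folds finer-grained window totals into coarser windows by summing.
    // Input and output both map a window start time to that window's total.
    static Map<Long, Long> rollup(Map<Long, Long> finer, long windowSizeMs) {
        Map<Long, Long> coarser = new TreeMap<>();
        for (Map.Entry<Long, Long> e : finer.entrySet()) {
            coarser.merge(windowStart(e.getKey(), windowSizeMs), e.getValue(), Long::sum);
        }
        return coarser;
    }

    // Average records per second needed to keep up with the given load.
    static double requiredRatePerSecond(long records, long intervalMs) {
        return records / (intervalMs / 1000.0);
    }

    public static void main(String[] args) {
        // Eight consecutive 15-minute totals, i.e. two hours of made-up data.
        Map<Long, Long> quarterHourly = new TreeMap<>();
        for (int i = 0; i < 8; i++) {
            quarterHourly.put(i * QUARTER_HOUR_MS, 10L);
        }
        Map<Long, Long> hourly = rollup(quarterHourly, HOUR_MS);
        Map<Long, Long> daily = rollup(hourly, DAY_MS);
        System.out.println("hourly = " + hourly);
        System.out.println("daily  = " + daily);
        // Load figure taken from the message: 12 million records per 15 minutes.
        System.out.println("required records/sec = "
                + requiredRatePerSecond(12_000_000L, QUARTER_HOUR_MS));
    }
}
```

Comparing the required rate (about 13,333 records per second on average)
against the throughput actually measured at each stage over time is one
way to narrow down which processor starts falling behind.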

Re: Kafka Streams throughput performance over time

Posted by Guozhang Wang <wa...@gmail.com>.
Hello Siva,

I'd suggest you upgrade to Kafka 2.0 once it is released (should be out
soon, probably this week) as it includes a critical performance
optimization for windowed aggregation operations.

Note that even if your brokers are on an older version, newer clients
such as Kafka Streams can still talk to them:
https://www.confluent.io/blog/upgrading-apache-kafka-clients-just-got-easier/

As long as your brokers are on version 0.10.1 or later, you can use any
newer version of Kafka Streams.


Guozhang

